June 19, 2013

Workshop with Uncle Bob at Norwegian Development Conference

Some of my notes from workshop with Uncle Bob at Norwegian Development Conference

About the future. 
Look out of a window. All those cars that you see there run software. Think about it. This should terrify you. Sooner or later a day will come when a software bug causes hundreds of people to die. Politicians will step in and ask why we didn't prevent it. We should have an answer ready. Otherwise, they will impose rules on us: use only this language, or that construct. They already disallow dynamic memory allocation in airplane software.

About TDD.
Software is fragile. Take any piece of software: it is possible to find a single bit that, if you flip its value, will make the software crash. Most engineers do not have this problem: if you take a brick out of a building, it will not fall down. There is, however, another profession with a similar problem - for an accountant, a mistake in a single digit may cause a company to go bankrupt. The solution accountants have been using for 500 years is double-entry bookkeeping. TDD is our double-entry bookkeeping. Accountants have managers too, and they have deadlines. But they never say, "OK, today I will register only the debits, and after the deadline I will go back and fix all the credits (since it is possible to recreate those from the registered debits)".
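
The analogy can be made concrete in code: in TDD every behaviour is recorded twice, once as a test and once as production code, and the two entries must balance. A minimal sketch (Java here; the class and the behaviour are invented for illustration):

```java
// A toy "ledger": the tests record the expected behaviour (the debits),
// the production code records the actual behaviour (the credits), and
// TDD demands that the two always balance.
public class Joiner {
    // Production code, written only after a failing test demanded it.
    public static String join(String left, String right, String sep) {
        if (left.isEmpty()) return right;
        if (right.isEmpty()) return left;
        return left + sep + right;
    }

    // The second entry in the ledger: executable expectations.
    public static void main(String[] args) {
        assert join("a", "b", ",").equals("a,b");
        assert join("", "b", ",").equals("b");
        assert join("a", "", ",").equals("a");
        System.out.println("ledger balances");
    }
}
```

Skipping the tests and "fixing them after the deadline" is exactly the single-entry bookkeeping that accountants refuse to do.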

About functional programming.
There is no assignment in functional programming. Is that good? Well, no assignment means no concurrent-update problem. Manufacturers can't increase the speed of a single CPU any more, so they have started adding more cores instead. Imagine a machine with 128 cores. Or 1024 cores. Will the OS make sure that no concurrent updates are possible? Maybe yes, maybe no. But functional programs can guarantee it.
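
The point about cores can be illustrated even in a non-functional language. In Java, a reduction over a parallel stream can safely be spread across any number of cores precisely because no shared variable is ever assigned to (the example is mine, not from the workshop):

```java
import java.util.stream.LongStream;

public class PureSum {
    // A mutable loop ("total += i * i") would be a concurrent update
    // on shared state if parallelized naively. The pure version below
    // is a fold with no assignment, so the runtime is free to split
    // the work across any number of cores.
    public static long sumOfSquares(long n) {
        return LongStream.rangeClosed(1, n)
                .parallel()
                .map(i -> i * i)   // pure function of i, no shared state
                .sum();            // associative reduction, no shared variable
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10)); // 385
    }
}
```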

About DLL and project decomposition.
We invented DLLs to be able to compose an application from components that can be deployed independently. Do you follow this rule when you decompose your application into a set of DLLs? [Note: that is why interfaces should be part of the application DLL, while implementations can live in different DLLs.]
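
The bracketed note can be sketched as follows (Java in place of C# DLLs; the names are invented). The application owns the interface; a separately deployed component only supplies an implementation, so it can be swapped without recompiling the application:

```java
// Lives in the application component: callers depend only on this.
interface ReportExporter {
    String export(String report);
}

// Lives in a separately deployed component; it depends on the
// application's interface, not the other way around.
class PdfReportExporter implements ReportExporter {
    public String export(String report) {
        return "PDF:" + report; // stand-in for real PDF rendering
    }
}

public class ExporterDemo {
    public static void main(String[] args) {
        // The concrete class is chosen only at composition time.
        ReportExporter exporter = new PdfReportExporter();
        System.out.println(exporter.export("Q1")); // PDF:Q1
    }
}
```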

About stable and unstable components and UI automation.
Components that have many incoming dependencies are stable and independent. Components that have many outgoing dependencies are unstable and dependent. The more stable a component is, the more abstractions it has to contain. The more unstable a component is, the more concrete classes it usually contains. The UI has many outgoing dependencies; hence, it is unstable. Do you write tests via the UI? Well, then you are basing something that has to be stable on something that is unstable.
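
Elsewhere Uncle Bob quantifies this as the instability metric I = Ce / (Ca + Ce), where Ca counts incoming (afferent) and Ce outgoing (efferent) dependencies. A small sketch of the arithmetic (the example counts are made up):

```java
public class StabilityMetrics {
    // Instability I = Ce / (Ca + Ce):
    //   Ca = afferent (incoming) couplings, Ce = efferent (outgoing) couplings.
    // 0.0 = maximally stable (everyone depends on it),
    // 1.0 = maximally unstable (it depends on everyone).
    public static double instability(int ca, int ce) {
        if (ca + ce == 0) return 0.0; // isolated component: treat as stable
        return (double) ce / (ca + ce);
    }

    public static void main(String[] args) {
        // A UI layer: one incoming dependency, nine outgoing.
        System.out.println(instability(1, 9)); // 0.9, unstable
        // A core domain model: nine incoming, one outgoing.
        System.out.println(instability(9, 1)); // 0.1, stable
    }
}
```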

About architecture.
Good architecture defers important decisions (e.g. which database to use). The web is only a delivery mechanism, so architecture should not be built around the fact that the application is a web application. The database is a detail, and architecture should not be built around the fact that the application uses a database to store content.
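
One sketch of what "deferring the database decision" can look like in code (Java; the names are invented): the application talks to a gateway interface, and an in-memory implementation carries it until a real storage engine is chosen.

```java
import java.util.HashMap;
import java.util.Map;

// The application talks to this boundary; nothing in it mentions
// a database, so the choice of storage engine stays deferred.
interface ArticleGateway {
    void save(String id, String body);
    String load(String id);
}

// Good enough to build and test the whole application against;
// a database-backed version can be slotted in later.
class InMemoryArticleGateway implements ArticleGateway {
    private final Map<String, String> store = new HashMap<>();
    public void save(String id, String body) { store.put(id, body); }
    public String load(String id) { return store.get(id); }
}

public class DeferredDecisionDemo {
    public static void main(String[] args) {
        ArticleGateway gateway = new InMemoryArticleGateway();
        gateway.save("a1", "Hello");
        System.out.println(gateway.load("a1")); // Hello
    }
}
```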
http://www.linkedin.com/in/oldbam

May 17, 2013

Generating Exception Classes in C# with T4

Being inspired by Marten Range's talk about T4 (Text Template Transformation Toolkit) on dotnetrocks, I decided to try it out on a problem that Marten mentioned at the beginning of the talk and that I recently encountered at work - generating exception classes in C#. According to the MSDN documentation, "derived exception classes should define four constructors". If you have more than one custom exception in your codebase, coding them manually is a boring task. So, I tried to figure out how much time it would take me to learn to solve this small problem with T4.

Let's assume we have a program that requires the user to input a string longer than one character but shorter than three, and throws two different exceptions if the input is shorter or longer:

static void Main(string[] args)
{
    Console.Write("Input a string longer than 1 character but shorter than 3 characters: ");
    string input = Console.ReadLine();
    if (input.Length < 2) throw new InputStringTooShortException();
    else if (input.Length > 2) throw new InputStringTooLongException();
    else Console.WriteLine("Thank you!");
}

The first solution that came to my mind was to have a master template file that uses a token instead of the exception type name, and then have consuming template files specify the value for this token.

Let's create a template that takes the provided exception type name and generates all the constructors.
ExceptionTemplate.tt:

<#@ output extension=".cs" #>
using System;

[Serializable()]
public class <#= ExceptionTypeName #>Exception : ApplicationException
{
    public <#= ExceptionTypeName #>Exception() : base() { }
    public <#= ExceptionTypeName #>Exception(string message) : base(message) { }
    public <#= ExceptionTypeName #>Exception(string message, Exception inner) : base(message, inner) { }
    protected <#= ExceptionTypeName #>Exception(
        System.Runtime.Serialization.SerializationInfo info,
        System.Runtime.Serialization.StreamingContext context) : base(info, context) { }
}

<#+ string ExceptionTypeName = string.Empty; #>

Within this template we used expression blocks (between the markers <#= and #>) to emit the exception type name, which is evaluated, converted to a string, and written to the output. We also used the class feature syntax (between the markers <#+ and #>) to declare the class member ExceptionTypeName, which consuming templates set as follows:

InputStringTooShortException.tt:

<#@ template language="C#" #>
<# this.ExceptionTypeName = "InputStringTooShort"; #>
<#@include File="ExceptionTemplate.tt" #>

InputStringTooLongException.tt:

<#@ template language="C#" #>
<# this.ExceptionTypeName = "InputStringTooLong"; #>
<#@include File="ExceptionTemplate.tt" #>

The above solution looks straightforward, but the problem with it is that we need one additional template per exception. We can avoid this by modifying ExceptionTemplate.tt to expose a method that accepts the exception type name as a parameter:

<#@ output extension=".cs" #>
using System;

<#+ void GenerateException(string exceptionTypeName) { #>

    [Serializable()]
    public class <#= exceptionTypeName #>Exception : ApplicationException
    {
        public <#= exceptionTypeName #>Exception() : base() { }
        public <#= exceptionTypeName #>Exception(string message) : base(message) { }
        public <#= exceptionTypeName #>Exception(string message, Exception inner) : base(message, inner) { }
        protected <#= exceptionTypeName #>Exception(
            System.Runtime.Serialization.SerializationInfo info,
            System.Runtime.Serialization.StreamingContext context) : base(info, context) { }
    }

<#+ } #>
Now, we need only one template for both exceptions that we want to generate:
GeneratedExceptions.tt:

<#@ template language="C#" #>
<#@include File="ExceptionTemplate.tt" #>

<# GenerateException("InputStringTooLong"); #>
<# GenerateException("InputStringTooShort"); #>

With this solution, we store all the exception classes in one file; if you need to store them in separate files, you can use a trick described in Oleg Sych's blog. By the way, his blog contains lots of information about using T4 ;)



August 28, 2011

Security Issues in OpenStack Object Storage

Presenting the results of my Master's thesis project concluded my two-year studies in the NordSecMob Master's Programme in Security and Mobile Computing. During the thesis project I analyzed security issues in OpenStack Object Storage, an open-source cloud storage system. Even though I worked on the project in Norway, the presentation itself was given in Denmark.

Some of the findings were quite interesting. For example, isolation of files belonging to different users is implemented using a hashing algorithm. Your account name, the file name, and the directory where the file is stored are combined with a secret value and passed to the MD5 hash function. The output determines the location of the file on the server. When we replaced MD5 with a dummy hash function that returned the same value for every input, users could read and even overwrite files belonging to other users. Even though MD5 is resistant to pre-image attacks (we can't find an input that hashes to a known output), it is not resistant to collision attacks (we can find two inputs that hash to the same value).
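
The placement mechanism can be sketched roughly as follows (Java instead of OpenStack's Python, and the real path layout and suffix handling are simplified): the storage location is derived from the MD5 of the object's full path combined with the secret value, so two paths that collide under MD5 map to the same location.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ObjectPlacement {
    // Simplified sketch of the idea: the location is derived from
    // MD5(account/container/object + secret suffix). Two paths that
    // collide under MD5 end up stored in the same place.
    public static String locationFor(String account, String container,
                                     String object, String secretSuffix) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            String path = "/" + account + "/" + container + "/" + object + secretSuffix;
            byte[] digest = md5.digest(path.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(locationFor("alice", "docs", "cv.pdf", "s3cret"));
        // A dummy hash returning a constant would send every object to the
        // same location, which is exactly the broken isolation described above.
    }
}
```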

This isolation approach allows an interesting attack against a cloud provider that uses OpenStack. First, the attacker negotiates a contractual agreement with the provider according to which the latter is responsible for preventing loss of the former's data. Second, the attacker generates two file names that hash to the same value. Then the attacker uploads the two files to OpenStack, and the second file overwrites the first one. Now the attacker can sue the provider for data loss. OpenStack has a mitigation against such an attack: a secret value that is stored on the server and used as a salt to the hash function. But with insider knowledge of this value, the aforementioned attack is possible.

Another issue we found is scary. In the default authorization system that ships with OpenStack Object Storage, there is one type of administrator that can download or even delete files belonging to any user on any account. This admin is called the Reseller Admin, and of course these broad permissions are mentioned nowhere in the documentation. So, if your company has divisions in the US and Europe, and you are a Reseller Admin in, say, Germany, you can view files belonging to users in the US. Quite cool, isn't it?

Full text of the thesis is available at this link. The presentation slides are here.





February 21, 2010

From Proof-Carrying Code to Agent Technologies

I am taking a course on "Language-Based Security" at the Danish Technical University. After listening to a couple of lectures on proof-carrying code (see [Nec97]), we were given a task to devise a proof-carrying architecture (PCA) for data exchanges. The idea of the PCA is the following: the data provider specifies a security policy for the data consumer to obey, and the data consumer provides a proof that it can satisfy the defined requirements. For example, in the case of a customer submitting credit card data to a web shop, the former might demand that the latter use a specific data encryption algorithm, say 3DES, and check that all communication partners encrypt data before sending it on.

The problem with the above requirements is control over the data. Even if the web shop can provide a proof that it is capable of using 3DES on the way from the customer to the web shop, there are no guarantees that the same security standards will be used on the way from the web shop to the bank. While thinking over possible implementations of the PCA, I came to the conclusion that a possible solution lies in the use of agent technologies.

Here is a description of how this scheme can work. The customer creates an agent, sets the data, and specifies the security policies for data transfer and manipulation. An example of such a security policy might be the following:
  • Encrypt data before sending over the web using Triple-DES
  • Allow Visa payment provider to retrieve the credit card data
  • Allow web-shop to redirect the agent to the payment provider
Before executing operations (e.g. "get data") the agent will check the caller's credentials to verify that the operation is allowed for the given caller. The idea behind this is that the web shop does not really need to see the credit card data; all it wants is confirmation from its bank or payment provider that the money was transferred. That is why the agent allows the web shop to redirect it to the payment provider, but not to get the actual credit card details. And of course additional data (e.g. the account details to which the money should be transferred) can be set upon the agent with its own security policy.

Here is the possible flow of interactions:



Well, the above approach has a lot of issues to consider.
[Edited on May 20, 2010] Below each issue there are some hints I received from Lawrie Brown:
  1. How to verify caller credentials (probably the agent may consult the certification authority, like Verisign)?
    Lawrie: Check the certificate signature using the well known public key of the signing CA, and also contact the CA to check the Certificate Revocation List (CRL) to ensure that the certificate has not been cancelled for some reason.
  2. Which encryption techniques should we use? Let's say we want to allow both the Visa and Mastercard payment providers to get the actual card information. But when we configure the agent, we do not know in advance which provider the web shop will use. And the web shop itself should not be able to decrypt this data even if it obtains it from the agent's code. If we use asymmetric cryptography, we will need to store two copies of the encrypted data, one for each provider. But if we use symmetric cryptography, we would probably not want to send the key within the agent.
    Lawrie: You would store ONE copy of the data, encrypted using a block or stream cipher with a temporary, randomly created session key. And then store two copies of that key, separately encrypted using the providers' respective public keys. This means the agent can send the encrypted data, along with the relevant public-key-encrypted session key, to the desired recipient. And no one looking at the data carried by the agent would be able to recover the encrypted info, apart from the two providers, who are the only ones with the private keys needed to recover the session key. This process can of course be extended to more than two providers.

  3. How to secure the data within the agent from being illegally obtained through code inspection or disassembly?
    Lawrie: This to me is the really big issue: how does the sender know the agent has not been corrupted, or the actions/answer compromised? That's the fundamental issue behind the use of agents, and I don't know the answer.
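
Lawrie's answer to issue 2 describes standard hybrid encryption. A sketch with the Java crypto API (the provider names and card data are placeholders): one random AES session key encrypts the single copy of the data, and that key is then wrapped once for each provider's public key.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.util.Arrays;

public class HybridEnvelope {
    public static boolean roundTrip() throws Exception {
        byte[] cardData = "4111-xxxx-xxxx-1111".getBytes(StandardCharsets.UTF_8);

        // ONE copy of the data, encrypted with a temporary random session key.
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey sessionKey = kg.generateKey();
        Cipher aes = Cipher.getInstance("AES/ECB/PKCS5Padding"); // ECB only to keep the sketch short
        aes.init(Cipher.ENCRYPT_MODE, sessionKey);
        byte[] encryptedData = aes.doFinal(cardData);

        // Two copies of the session key, wrapped with each provider's public key.
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
        kpg.initialize(2048);
        KeyPair visa = kpg.generateKeyPair();
        KeyPair mastercard = kpg.generateKeyPair();
        Cipher rsa = Cipher.getInstance("RSA");
        rsa.init(Cipher.WRAP_MODE, visa.getPublic());
        byte[] keyForVisa = rsa.wrap(sessionKey);
        rsa.init(Cipher.WRAP_MODE, mastercard.getPublic());
        byte[] keyForMastercard = rsa.wrap(sessionKey);

        // A provider unwraps its copy with its private key and decrypts the data;
        // nobody without one of the two private keys can recover anything.
        rsa.init(Cipher.UNWRAP_MODE, visa.getPrivate());
        SecretKey recovered = (SecretKey) rsa.unwrap(keyForVisa, "AES", Cipher.SECRET_KEY);
        aes.init(Cipher.DECRYPT_MODE, recovered);
        return keyForMastercard.length > 0 && Arrays.equals(cardData, aes.doFinal(encryptedData));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip()); // true
    }
}
```
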
Despite the problems and immaturity of agent-based systems, my prediction is that in a few years we will use agents on a regular basis. Time will tell... ;)

[Nec97] George C. Necula. Proof-carrying code. In Proceedings of the 24th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL '97), pages 106-119. ACM Press, 1997.


October 21, 2009

Security: Monitoring Your Geographical Log-in Location

While eliciting software requirements for a student project, I found a draft of the OWASP guidelines for identification and authorization here. One item in the list attracted my attention:

1.15 Concurrently, the system should perform velocity checking against the IP address of the last known valid log-in from that user so as to ensure the user is logging in from an IP address range originating from within that country of origin. It would not be feasible to expect a user logging in from one geographic location and then from another within a specified amount of time, e.g., Correct credentials supplied from a user originating from the United Kingdom at GMT 10:00Hrs and a second log-in attempt from a user in Venezuela at GMT 10:27Hrs.
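
In code, the guideline boils down to: geolocate both IP addresses, compute the great-circle distance, divide by the elapsed time, and flag impossible speeds. A sketch (Java; the 1000 km/h threshold and the coordinates are illustrative, and a real system would also have to allow for VPNs and proxies):

```java
public class VelocityCheck {
    private static final double EARTH_RADIUS_KM = 6371.0;

    // Great-circle distance between two lat/lon points (haversine formula).
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // Flag the log-in if the implied travel speed is beyond anything plausible.
    static boolean suspicious(double lat1, double lon1, double lat2, double lon2,
                              double minutesBetween) {
        double speedKmh = distanceKm(lat1, lon1, lat2, lon2) / (minutesBetween / 60.0);
        return speedKmh > 1000.0; // faster than a commercial flight
    }

    public static void main(String[] args) {
        // London at 10:00, Caracas at 10:27 -- the guideline's own example.
        System.out.println(suspicious(51.5, -0.1, 10.5, -66.9, 27)); // true
    }
}
```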


Earlier I had seen that Facebook performs a similar validation. When you log in from a different location, Facebook presents the following validation screen:



However, there might be problems with Facebook's approach, since it may not be difficult to discover a person's birthday (try a Google search to find your own birthday; chances are you will be impressed ;).





February 18, 2009

GWT: Adding History Support

There is one very simple yet important rule for those who want to add History support to a GWT web site:

If pressing your UI element should result in changing the visible screen content, then the event handler triggered by the UI element (e.g. onClick) should simply add a new item to the History stack and exit.

If you wonder where the code that actually changes the screen content should be, I will answer: "In the HistoryListener#onHistoryChanged method." This method is triggered as soon as the history item is created. If you adhere to the above rule, you get the following benefits:
  1. You make it possible to bookmark specific states of your web site
  2. You do not duplicate the logic that changes page content: all such code goes into the onHistoryChanged method
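
Stripped of GWT specifics, the rule is the listener pattern below (plain Java stand-ins; in real GWT the push is History.newItem and the render hook is HistoryListener#onHistoryChanged):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;

public class HistoryDemo {
    // Minimal stand-in for GWT's History: pushing a token notifies the listener.
    static class HistoryStack {
        private final Deque<String> stack = new ArrayDeque<>();
        private final Consumer<String> onHistoryChanged;
        HistoryStack(Consumer<String> onHistoryChanged) { this.onHistoryChanged = onHistoryChanged; }
        void newItem(String token) {
            stack.push(token);
            onHistoryChanged.accept(token); // ALL rendering happens downstream of here
        }
    }

    static final StringBuilder screen = new StringBuilder();

    public static void main(String[] args) {
        // The single place that changes screen content:
        HistoryStack history = new HistoryStack(token -> {
            screen.setLength(0);
            screen.append("showing: ").append(token);
        });

        // A click handler does nothing but push a token and exit:
        history.newItem("settings");
        System.out.println(screen); // showing: settings
    }
}
```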

Links:
Introduction to the GWT History mechanism




December 14, 2008

How to write testable code - The Clean Code Talks

The Google TechTalks video series has introduced a couple of new videos under the title "The Clean Code Talks". All of them are short and easy-to-understand talks presented by Miško Hevery. Here is the summary I took away from the "Don't look for things" and "Unit testing" videos:
  1. Object construction code should be separated from business logic code to make the code testable.
  2. To satisfy the above, one should use Dependency Injection (with a preference for constructor injection over setter injection)
If anybody is interested in the topic but has no time to watch the full-length videos, I would recommend Martin Fowler's article on Dependency Injection, which covers all the important details on the topic.
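
A minimal illustration of both takeaways (Java; the classes are invented, not taken from the talks): construction happens in one place, the business logic receives its collaborator through the constructor, and a test can hand it a fake.

```java
// The business logic asks for its collaborators instead of constructing them.
interface PaymentGateway {
    boolean charge(int cents);
}

class CheckoutService {
    private final PaymentGateway gateway;

    // Constructor injection: the caller decides which implementation to use,
    // so tests can pass a fake without touching any real payment system.
    CheckoutService(PaymentGateway gateway) { this.gateway = gateway; }

    String checkout(int cents) {
        return gateway.charge(cents) ? "paid" : "declined";
    }
}

public class DiDemo {
    public static void main(String[] args) {
        // Production wiring would pass a real gateway here; a test passes a
        // stub -- possible only because construction was separated out.
        CheckoutService service = new CheckoutService(cents -> cents <= 5000);
        System.out.println(service.checkout(1000)); // paid
        System.out.println(service.checkout(9999)); // declined
    }
}
```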

