Posts Tagged ‘Architecture’

Concurrency Patterns: Producer and Consumer

22 August, 2011 1 comment

In the enterprise world, where performance is the key to everything, concurrency patterns bring a very interesting and effective solution to the table. One specific pattern, Producer and Consumer, allows us to write programs with high throughput and get the job done much more quickly. This pattern solves a common problem where we have to migrate data from System 1 to System 2, and in the process we need to do three tasks: load data from the database based on groups, process the records, and update the records back.

You can read the complete post here on Scratchpad101.com (http://scratchpad101.com/2011/08/22/concurrency-pattern-producer-consumer/)
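
To make the idea concrete before you read the full post, here is a minimal sketch in plain Java. It is not the code from the complete post: the class name MigrationSketch, the END marker and the upper-casing "process" step are assumptions made only for illustration. One thread produces records into a bounded queue while another consumes them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class MigrationSketch {
    // Hypothetical end-of-data marker; real records must never equal it.
    private static final String END = "__END__";

    // Moves every record from the source list into a new target list,
    // upper-casing each one as a stand-in for the "process" step.
    static List<String> migrate(List<String> sourceRecords) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(10); // bounded buffer
        List<String> target = new ArrayList<>();

        Thread producer = new Thread(() -> {
            try {
                for (String record : sourceRecords) {
                    queue.put(record); // blocks while the queue is full
                }
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                String record;
                while (!(record = queue.take()).equals(END)) { // blocks while the queue is empty
                    target.add(record.toUpperCase());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("migration interrupted", e);
        }
        return target;
    }
}
```

The bounded queue is the design choice that matters here: if the producer is faster than the consumer, it is forced to wait instead of filling up memory.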

TestNG or JUnit

7 August, 2011 7 comments

For many years now, I have found myself going back to TestNG whenever it comes to unit testing Java code. Every time I picked up TestNG, people asked me why I go with TestNG, especially when JUnit is provided by default in development environments like Eclipse or Maven. Continuing the same battle, yesterday I started to look into Spring’s testing support, which is also built on top of JUnit. However, within a few minutes of using it, I was searching for a feature that I have always found missing in JUnit: TestNG provides parameterized testing using DataProviders. Given that I was once again asking myself a familiar question, TestNG or JUnit, I decided to document this so that next time I am sure which one and why.

Essentially the same

If you are just going to do some basic unit testing, the two frameworks are basically the same. Both allow you to test code in a quick and effective manner. Both have tool support in Eclipse and other IDEs, and both are supported by build tools like Ant and Maven. For starters, JUnit has always been the choice because it was the first framework for unit testing and has always been available. Many people I talk to had not heard about TestNG until we talked about it.

Flexibility

Let us look at a very simple test case for each of the two.

package com.kapil.itrader;

import static org.junit.Assert.assertEquals;

import org.junit.BeforeClass;
import org.junit.Test;

public class FibonacciTest
{
    // Sample values; the parameterized version later in this post supplies them from a data set.
    private Integer input = 6;
    private Integer expected = 8;

    @BeforeClass
    public static void beforeClass()
    {
        // do some initialization
    }

    @Test
    public void testCompute()
    {
        System.out.println("Input: " + input + ". Expected: " + expected);
        // The static import lets us call assertEquals directly (assumes compute returns an Integer).
        assertEquals(expected, Fibonacci.compute(input));
    }
}

Well, this example shows that I am using version 4.x+ and am making use of annotations. Prior to release 4.0, JUnit did not support annotations, and that was a major advantage TestNG had over its competitor; but JUnit quickly adapted. You can notice that JUnit also supports static imports, so we can do away with the more cumbersome code of previous versions.

package com.kapil.framework.core;
import org.testng.Assert;
import org.springframework.context.support.ClassPathXmlApplicationContext;
import org.testng.annotations.BeforeSuite;
import org.testng.annotations.Test;

public class BaseTestCase
{
    protected static final ClassPathXmlApplicationContext context;

    static
    {
        context = new ClassPathXmlApplicationContext("rootTestContext.xml");
        context.registerShutdownHook();
    }

    @BeforeSuite
    private void beforeSetup()
    {
       // Do initialization
    }

    @Test
    public void testTrue()
    {
        Assert.assertTrue(true); // placeholder assertion
    }
}

At first look, the two pieces of code seem pretty much the same. However, those who have done enough unit testing will agree with me that TestNG allows for more flexibility. JUnit requires me to declare my @BeforeClass initialization method as static; consequently, anything that method touches has to be static too. JUnit also requires the initialization method to be public; TestNG does not. I can use best practices from OOP in my test classes as well. TestNG also allows me to declare test suites, groups and methods, and to use annotations like @BeforeSuite, @BeforeMethod and @BeforeGroups in addition to @BeforeClass. This is very helpful when writing any level of integration tests, or unit tests that need to access common data sets.

Test Isolations and Dependency Testing

JUnit is very effective when it comes to testing in isolation. This essentially means that you cannot control the order in which tests execute. Hence, if you have two tests that you want to run in a specific order because of some kind of dependency, you cannot do that using JUnit. TestNG, however, allows you to do this very effectively. In JUnit you can work around this problem, but it is neither neat nor easy.

Parameter based Testing

A very powerful feature that TestNG offers is “Parameterized Testing”. JUnit has added some support for this in 4.5+ versions, but it is very basic and not as effective as TestNG’s. If you have worked with FIT, you will know what I am talking about. I have modified my previous test case to include parameterized testing.

package com.kapil.itrader;

import static org.junit.Assert.assertEquals;

import java.util.Arrays;
import java.util.Collection;

import org.junit.BeforeClass;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;
import org.junit.runners.Parameterized.Parameters;

@RunWith(Parameterized.class)
public class FibonacciTest
{
    private Integer input;
    private Integer expected;

    // Each row is one { input, expected } pair that the runner passes to the constructor.
    @Parameters
    public static Collection<Object[]> data()
    {
        return Arrays.asList(new Object[][] { { 0, 0 }, { 1, 1 }, { 2, 1 }, { 3, 2 }, { 4, 3 }, { 5, 5 }, { 6, 8 } });
    }

    @BeforeClass
    public static void beforeClass()
    {
        System.out.println("Before");
    }

    public FibonacciTest(Integer input, Integer expected)
    {
        this.input = input;
        this.expected = expected;
    }

    @Test
    public void testCompute()
    {
        System.out.println("Input: " + input + ". Expected: " + expected);
        assertEquals(expected, Fibonacci.compute(input));
    }

}

You will notice that I have used the @RunWith annotation to make my test case parameterized. In this case, the static method data(), annotated with @Parameters, is used to provide data to the class. However, the biggest issue is that the data is passed to the class constructor. This lets me code only logically bound test cases in this class, and I will end up with multiple test classes for one service, because the various methods in the service will require different data sets. The good thing is that there are various open source frameworks which have extended this approach and added their own “RunWith” implementations to allow integration with external sources like CSV, HTML or Excel files.

TestNG provides this support out of the box: not support for reading from CSV or external files, but Data Providers.

package com.kapil.itrader.core.managers.admin;

import org.testng.Assert;
import org.testng.annotations.Test;

import com.uhc.simple.common.BaseTestCase;
import com.uhc.simple.core.admin.manager.ILookupManager;
import com.uhc.simple.core.admin.service.ILookupService;
import com.uhc.simple.dataprovider.admin.LookupValueDataProvider;
import com.uhc.simple.dto.admin.LookupValueRequest;
import com.uhc.simple.dto.admin.LookupValueResponse;

/**
 * Test cases to test {@link ILookupService}.
 */
public class LookupServiceTests extends BaseTestCase
{

    @Test(dataProvider = "LookupValueProvider", dataProviderClass = LookupValueDataProvider.class)
    public void testGetAllLookupValues(String row, LookupValueRequest request, LookupValueResponse expectedResponse)
    {
        ILookupManager manager = super.getLookupManager();
        LookupValueResponse actualResponse = manager.getLookupValues(request);
        Assert.assertEquals(actualResponse.getStatus(), expectedResponse.getStatus());
    }
}

The code snippet above shows that I have passed dataProvider as an attribute of the @Test annotation, along with the class responsible for creating the data that is supplied to the method at invocation time. Using this mechanism, I can write test cases and their data providers in a decoupled fashion and use them very effectively.
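
The provider class itself is not shown above, so here is a rough, library-free sketch of the shape such a provider takes, using the Fibonacci data from the JUnit example rather than my lookup service. In TestNG a provider method is annotated with @DataProvider and returns Object[][], one row per invocation. The sketch keeps the annotations in comments so that it runs on a plain JDK. Every class name here is hypothetical, and runAll() only imitates what the framework does for you.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the class under test (hypothetical implementation).
class FibonacciSketch {
    static int compute(int n) {
        int a = 0, b = 1;
        for (int i = 0; i < n; i++) {
            int next = a + b;
            a = b;
            b = next;
        }
        return a;
    }
}

// The data lives in its own class, decoupled from the test.
class FibonacciDataProvider {
    // In TestNG: @DataProvider(name = "fibonacciData")
    static Object[][] fibonacciData() {
        return new Object[][] { { 0, 0 }, { 1, 1 }, { 2, 1 }, { 3, 2 }, { 4, 3 }, { 5, 5 }, { 6, 8 } };
    }
}

class FibonacciTests {
    // In TestNG: @Test(dataProvider = "fibonacciData", dataProviderClass = FibonacciDataProvider.class)
    // The data arrives as method parameters, not through the constructor.
    static boolean testCompute(int input, int expected) {
        return FibonacciSketch.compute(input) == expected;
    }

    // Imitates the framework: invoke the test method once per row and collect failures.
    static List<String> runAll() {
        List<String> failures = new ArrayList<>();
        for (Object[] row : FibonacciDataProvider.fibonacciData()) {
            int input = (Integer) row[0];
            int expected = (Integer) row[1];
            if (!testCompute(input, expected)) {
                failures.add("compute(" + input + ") != " + expected);
            }
        }
        return failures;
    }
}
```

Because the data is bound per method, a second test method in the same class can point at a completely different provider, which is exactly what the constructor-based JUnit approach makes awkward.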

Why I choose TestNG

For me, parameterized testing is the biggest reason why I choose TestNG over JUnit. However, everything I have listed above is why I always want to spend a few minutes setting up TestNG in a new Eclipse setup or Maven project. TestNG is very useful when it comes to running big test suites. For a small project or a training exercise JUnit is fine, because anyone can start with it very quickly; but not for projects where we need thousands of test cases, most of which have various scenarios to cover.

http://kapilvirenahuja.com/tech/2011/08/07/testng-or-junit/

‘The Null’ Nuisance

3 February, 2009 1 comment

While working on enhancements to a project already in production, I had a very interesting conversation. Let me give a brief background: the core architecture is all in place and we need to build in new functionality. Of course, refactoring is being done along the way. In a specific scenario, I got into a conversation with a fellow architect on the usage of “nulls” and “null checks”. The theme of the conversation was “Should a method return a null or an initialized instance of the class?” Let me take an example:

There is a service method that connects to a database and loads the records for all users in the system. In the DAO we load the record set from the database and convert it to an ArrayList of DTOs (value objects). Sample code to map a DTO generally looks like this:

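List<User> users = null;
for (int index = 0; index < recordSet.size(); index++)
{
    if (users == null)
    {
        users = new ArrayList<User>(); // created only when at least one record exists
    }
    User user = new User();
    user.setFirstName(recordSet.getString("firstName"));
    user.setMiddleName(recordSet.getString("middleName")); // may be null
    user.setLastName(recordSet.getString("lastName"));
    users.add(user);
}

// Returns null when there are no records.
return users;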

I had an objection to this style of coding. The simple reason: on the front-end, I had to put in a null check that was unnecessary. The other classes consuming the results had to write the following code:

List<User> users = loadAll();
if (users != null)
{
    for (User user : users)
    {
        if (user.getMiddleName() != null)
        {
            // show the middle name
        }
        else
        {
            // do not show the middle name
        }
    }
}
else
{
    // show "No users exist"
}

Now, consider a scenario with complex objects that have lists all the way down the hierarchy. It means that before we access a property, we have to add a null check. Soon, this “do nothing” null check becomes a headache. Someone has coded a null propagation somewhere and we cannot trace it, so we feel the easiest way out is to put in a null check. In my example, I would have my JSP strewn with null checks cluttering my code.

Unfortunately, this will not solve the real problem. A simple solution is to identify the code where a null reference can be introduced and handle it there. The rest of the code will be happy about it.

More importantly, let us pause for a minute and ask ourselves: is there something that the application can do with an object referring to nothing? Let us go back to my example and see how the application is going to use the user list. We need the list of users to display a report. If no users are returned, the user should see “No users exist”. The UI is not sure what represents no users: a null object, an initialized object of size 0, or an exception. This means that the developer consuming the method has to write multiple conditions for a simple check.

We can do one of the following:

1. Throwing a business exception when a business rule is violated can be an effective strategy. However, it largely depends on how you use exceptions in your application. Remember, raising an exception is an expensive operation.

2. Alternatively, you can provide an Empty implementation of the object that can do something useful like logging an info or an error to the log system.
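
A minimal sketch of what such an "Empty" implementation could look like is below. The names UserView, RealUser, MissingUser and UserLookup are hypothetical and not from the project I was working on; the point is only that the caller always gets an object it can safely call.

```java
import java.util.Map;
import java.util.logging.Logger;

interface UserView {
    String displayName();
}

class RealUser implements UserView {
    private final String firstName;
    private final String lastName;

    RealUser(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    public String displayName() {
        return firstName + " " + lastName;
    }
}

// The "empty" implementation: safe to call, and it logs that it was used.
class MissingUser implements UserView {
    private static final Logger LOG = Logger.getLogger(MissingUser.class.getName());

    public String displayName() {
        LOG.info("displayName() called on a missing user");
        return "";
    }
}

class UserLookup {
    // Never returns null: an unknown id yields a MissingUser instead.
    static UserView find(Map<String, UserView> usersById, String id) {
        UserView user = usersById.get(id);
        return user != null ? user : new MissingUser();
    }
}
```

The null check still exists, but it lives in exactly one place (UserLookup.find) instead of in every caller.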

I am not a huge fan of throwing an exception, also because it is expensive, so I am exploring the second option. This changes my code to:

List<User> users = new ArrayList<User>();
for (int index = 0; index < recordSet.size(); index++)
{
    User user = new User();
    user.setFirstName(recordSet.getString("firstName"));
    if (recordSet.getString("middleName") == null)   // You can also use StringUtils from Apache Commons Lang
    {
        user.setMiddleName("");
    }
    else
    {
        user.setMiddleName(recordSet.getString("middleName"));
    }
    user.setLastName(recordSet.getString("lastName"));
    users.add(user);
}

// We always return an initialized list; it is simply empty when there are no users.
return users;

This will change the UI code to:

List<User> users = loadAll();
// code to show the middle name – if it does not exist, it will show up as blank.


The most evident benefit is: “No more if statements for null checks on the UI. The check is pushed down in the call hierarchy. Hence, multiple callers of the same method do not have to worry about nulls.”

The most important question is “Is this approach safe?” Nothing ever is. There is no reason for someone to code incorrectly. Of course, we cannot rely on external libraries never to return null references, but when you write your own code, following this approach can lead to a less cluttered application and better control over the source code.

Remember: this approach is not always necessary; just ensure that a null reference is never catastrophic.
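
For anyone who wants to try the idea end to end, here is a small runnable sketch. It is an illustration under assumptions, not the project code: a list of maps stands in for the record set, and UserRecord and UserDaoSketch are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class UserRecord {
    final String firstName;
    final String middleName; // never null; "" when the user has no middle name
    final String lastName;

    UserRecord(String firstName, String middleName, String lastName) {
        this.firstName = firstName;
        this.middleName = middleName;
        this.lastName = lastName;
    }
}

class UserDaoSketch {
    // Each map is one row of the (hypothetical) record set.
    static List<UserRecord> loadAll(List<Map<String, String>> rows) {
        List<UserRecord> users = new ArrayList<>();
        for (Map<String, String> row : rows) {
            String middle = row.get("middleName");
            users.add(new UserRecord(row.get("firstName"),
                                     middle == null ? "" : middle, // null handled once, here
                                     row.get("lastName")));
        }
        return users; // empty when there are no rows, never null
    }

    // The "UI": no null checks, only a question about the size of the list.
    static String render(List<UserRecord> users) {
        if (users.isEmpty()) {
            return "No users exist";
        }
        StringBuilder out = new StringBuilder();
        for (UserRecord user : users) {
            out.append(String.join(" | ", user.firstName, user.middleName, user.lastName)).append('\n');
        }
        return out.toString();
    }
}
```

With this shape the UI has one answer to "what represents no users": an empty list.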

Extract, Transform, Load

14 January, 2009 2 comments

ETL in computing terminology refers to the Extract, Transform and Load process. It relates mostly to data warehousing projects. An ETL framework involves the following three steps:

1. Extract: This step loads the data from a data source, which could be a database or a file dump from another system.

2. Transform: This step massages the data into an appropriate form. It may need to trim down the data or aggregate data from multiple data sources.

3. Load: This is the final step, which uploads the data into another data source such as a database, or generates a flat file.

Extract

This first part of the process involves reading data from various data sources. The data itself could be in different formats. Some of the most commonly used sources are databases and flat files. In some cases the data sources may also include non-relational data sources.

Transform

This next step applies various rules to the data set and prepares the data for the Load step. Some data sets may need very little or no transformation, while others need very complex transformations to meet the business requirements. Some of the common operations that may be needed here are:

  • Filtering the data set for a subset of records
  • Generating new values based on existing columns (using pre-defined formulas)
  • Splitting of data set into different tables
  • Aggregating data from various data sets
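
As a rough illustration, the operations listed above map quite naturally onto Java streams. The SaleRow type, its fields and the method names below are made up for this sketch; a real transform step would work on whatever the extract phase produced.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class SaleRow {
    final String region;
    final int quantity;
    final double unitPrice;

    SaleRow(String region, int quantity, double unitPrice) {
        this.region = region;
        this.quantity = quantity;
        this.unitPrice = unitPrice;
    }

    // Generating a new value from existing columns with a pre-defined formula.
    double total() {
        return quantity * unitPrice;
    }
}

class TransformSketch {
    // Filtering: keep only the rows with a positive quantity.
    static List<SaleRow> filterValid(List<SaleRow> rows) {
        return rows.stream()
                   .filter(row -> row.quantity > 0)
                   .collect(Collectors.toList());
    }

    // Splitting: one data set becomes two, keyed by whether the total reaches the threshold.
    static Map<Boolean, List<SaleRow>> splitBySize(List<SaleRow> rows, double threshold) {
        return rows.stream()
                   .collect(Collectors.partitioningBy(row -> row.total() >= threshold));
    }

    // Aggregating: total revenue per region.
    static Map<String, Double> revenueByRegion(List<SaleRow> rows) {
        return rows.stream()
                   .collect(Collectors.groupingBy(row -> row.region,
                                                  TreeMap::new,
                                                  Collectors.summingDouble(SaleRow::total)));
    }
}
```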

Load

The load phase loads the data into the target. This phase can do various things depending on the business needs. Sometimes the load may mean uploading a fresh data set on an incremental basis. In other cases it may need to update an existing data set.

ETL Flow

1. Cycle Initiation: This is the very first step in the ETL process, where you collect all the reference data and validate that the settings provided are correct. This is the initialization phase. If there are errors during initialization, the ETL process fails.

2. Extract: In this step, you read the data from the datasource

3. Validate: Here the data is validated against a pre-defined business ruleset

4. Transform: Apply any transformation rules

5. Stage: This can be categorized as a subset of the transform stage. A business requirement may need us to load the data into a temporary space, for example when we need to aggregate data from more than one data source. In that case, we use a temp database to hold the data sets before we can apply the transformation.

6. Load: Load the data into final data source.

7. Cleanup: Clean up any temp files / databases.
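
The seven steps can be sketched as a single method. This is only a toy under stated assumptions: in-memory lists stand in for the data source, the temp database and the final target, records are plain strings, and the EtlCycleSketch class and its log field are hypothetical. What it does show is the ordering, the fail-fast initialization, and cleanup that runs even if a later step throws.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

class EtlCycleSketch {
    // Records the order in which the steps ran, so the flow is visible.
    final List<String> log = new ArrayList<>();

    List<String> run(List<String> source,
                     Predicate<String> businessRule,
                     Function<String, String> transformation) {
        // 1. Cycle initiation: the whole process fails if the settings are wrong.
        log.add("init");
        if (source == null || businessRule == null || transformation == null) {
            throw new IllegalArgumentException("ETL cycle is not configured");
        }

        List<String> staging = new ArrayList<>(); // stands in for a temp database
        try {
            log.add("extract");
            List<String> extracted = new ArrayList<>(source);

            log.add("validate");
            extracted.removeIf(businessRule.negate());

            log.add("transform");
            List<String> transformed = new ArrayList<>();
            for (String record : extracted) {
                transformed.add(transformation.apply(record));
            }

            log.add("stage");
            staging.addAll(transformed);

            log.add("load");
            return new ArrayList<>(staging); // stands in for the final data source
        } finally {
            // 7. Cleanup runs whether or not the steps above succeeded.
            log.add("cleanup");
            staging.clear();
        }
    }
}
```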

Challenges

Some of the common challenges are:

  • An ETL process involves considerable complexity, and significant problems can come up with an incorrectly designed solution
  • Data sets in production can be vastly different than what developers of the system use. This can lead to huge performance bottlenecks
  • These types of solutions grow horizontally which involves adding more data sources either to extract or load. The solutions should be designed to support addition of such data sources with minimal effort

Performance

This is the biggest challenge that any ETL solution has to struggle with. Most often the slowest part of the ETL process is the load phase, where we have to take care of the various database structures, the integrity of the records and the indexes. The transform phase can also lead to performance bottlenecks if extensive data transformations are needed.

Best Practices

Layered Architecture Design

Core Layer: This is the primary layer which holds all the business logic or core processing like Extract, Transform and Load

Job Management Layer: This layer should take care of scheduling jobs, managing queues and other operational activities like activation of tasks, alerts etc

Auditing and Error Handling: This layer should be dedicated to auditing process, logging entries to log files or database. Also, providing error handling support

Utilities: A common layer to provide common functionality across layers

Core Layer

This is the most important layer and holds the most logic. As a good practice, this layer should be divided into three sections, controlled by a common processing logic. Some common components of this layer can be:

1. Controller: These hold the processing logic which co-ordinate the entire ETL lifecycle. They hold the details of the various utilities and invoke them as needed.

2. Readers: These hold logic to read data from data sources like databases and flat files. Their responsibility should be to load the data set and make it available for next phase.

3. Transformer: These components hold the logic for applying transformations to the data. Transformations can be business validations, mappings or other logic.

4. Mappers: These hold the mapping for a transformation. The controller should be aware of the mappings that are to be applied to the loaded data. In most common cases, the framework should make interfaces available to the consumers of the framework to define mappings. The framework (via controller) should consume those mappings

5. Validation: If there are validations that need to be applied, these components should be defined individually. Again, in most cases, the framework should make these available as interfaces, and concrete implementations would be provided by consumers of the framework.

6. Loaders: These hold logic to load the data into a data source like databases and flat files.
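
One possible way to express these components as interfaces is sketched below. I am not claiming this is how an ETL framework must look; all the type names (RecordReader, RecordValidator, RecordMapper, RecordTransformer, RecordLoader, EtlController) are assumptions. The framework owns the controller and the transformer, while consumers plug in the reader, validation, mapping and loader.

```java
import java.util.ArrayList;
import java.util.List;

// Supplied by consumers of the framework.
interface RecordReader<I> { List<I> read(); }
interface RecordValidator<I> { boolean isValid(I record); }
interface RecordMapper<I, O> { O map(I record); }
interface RecordLoader<O> { void load(List<O> records); }

// Applies the consumer's validation and mapping to the data set.
class RecordTransformer<I, O> {
    private final RecordValidator<I> validator;
    private final RecordMapper<I, O> mapper;

    RecordTransformer(RecordValidator<I> validator, RecordMapper<I, O> mapper) {
        this.validator = validator;
        this.mapper = mapper;
    }

    List<O> transform(List<I> input) {
        List<O> output = new ArrayList<>();
        for (I record : input) {
            if (validator.isValid(record)) {
                output.add(mapper.map(record));
            }
        }
        return output;
    }
}

// Co-ordinates the lifecycle: read, transform, load.
class EtlController<I, O> {
    private final RecordReader<I> reader;
    private final RecordTransformer<I, O> transformer;
    private final RecordLoader<O> loader;

    EtlController(RecordReader<I> reader, RecordTransformer<I, O> transformer, RecordLoader<O> loader) {
        this.reader = reader;
        this.transformer = transformer;
        this.loader = loader;
    }

    // Returns the number of records that reached the loader.
    int run() {
        List<O> records = transformer.transform(reader.read());
        loader.load(records);
        return records.size();
    }
}
```

Since each interface has a single method, a consumer can supply small implementations as lambdas, which keeps adding a new data source cheap.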

I am not a Subject matter expert on ETL, but hope this helps.

Categories: Architecture, Design

What should I do?

Has this question ever crossed your mind? With evolving technologies and frameworks, this is a question I ask myself all the time, especially at the outset of a new project.

A few weeks back, I started to work on an architecture definition that holds true in most cases. Of course, this is going to be on the J2EE platform. With this series, I will try to share my understanding of various available frameworks and comparisons of existing technologies. The series is my attempt to define a technology stack and a taxonomy that I can re-use across projects, hoping to get a productivity boost. I am not saying that the stack will not evolve. What I am trying to say is “Be ready!”

Cheers

Need for 3-tier Architecture

Last week, I was working to define an architecture for an existing application. When I walked into the room with the proposal, the Senior Delivery Manager asked me, “Why do we need an architecture? Why can we not use what we already have?” His concern was logical: this shift was going to push him behind schedule. While I spent the next 20 minutes explaining to him the importance of and need for a 3-tier architecture, it dawned upon me that I have done this several times. If only I had this documented on paper, it would save me a lot of time.

What is a Layer?

A layer is a logical separation of code. In the J2EE world, this generally means an independent Java project that holds the logic. A layer is responsible for speaking to other layers in the application, providing or extracting information. An example is the Presentation Layer, which is responsible for showing data to the user but has to extract that information from various other layers.

Two-Tier Architecture

A two-tier architecture is one where all the code for extracting data from the database and the presentation logic, i.e. showing data to the user, resides in the web layer itself. A definite advantage of this approach is that it is handy and allows rapid development. However, it has some obvious disadvantages:

  • Putting all the code in the web layer makes it difficult to maintain. 80% of the application life cycle is spent on maintenance and support. Having unmanageable code only makes matters worse
  • Code reuse is not possible. Many times, with changing needs, organizations decide to change the application front-end or presentation. At times, they decide to add other add-ons. Having the code sit in the web layer makes this impossible. Hence, the application cannot be scaled
  • Relying on data source (JDBC) controls makes things more complex.

We solve this problem by introducing a 3-tier architecture, which separates the code into logical groupings, i.e. Data Access, Business Logic and Presentation Logic. This could be a slow process to start with, but it has many advantages in the long run.
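
Here is a very small sketch of the three logical groupings in plain Java. The names (CustomerDao, InMemoryCustomerDao, CustomerService, CustomerPage) are hypothetical, and an in-memory list stands in for the database so that the example runs on its own.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Data Access: the only place that knows where the data comes from.
interface CustomerDao {
    List<String> findAllNames();
}

class InMemoryCustomerDao implements CustomerDao {
    private final List<String> names;

    InMemoryCustomerDao(List<String> names) {
        this.names = names;
    }

    public List<String> findAllNames() {
        return new ArrayList<>(names);
    }
}

// Business Logic: rules live here, independent of both storage and display.
class CustomerService {
    private final CustomerDao dao;

    CustomerService(CustomerDao dao) {
        this.dao = dao;
    }

    // Drops blank names and returns the rest sorted alphabetically.
    List<String> customerNames() {
        List<String> result = new ArrayList<>();
        for (String name : dao.findAllNames()) {
            if (name != null && !name.trim().isEmpty()) {
                result.add(name.trim());
            }
        }
        Collections.sort(result);
        return result;
    }
}

// Presentation Logic: only formats what the service hands over.
class CustomerPage {
    private final CustomerService service;

    CustomerPage(CustomerService service) {
        this.service = service;
    }

    String render() {
        StringBuilder html = new StringBuilder("<ul>");
        for (String name : service.customerNames()) {
            html.append("<li>").append(name).append("</li>");
        }
        return html.append("</ul>").toString();
    }
}
```

Swapping the front-end means writing another class like CustomerPage; swapping the database means writing another CustomerDao. Neither change touches the business rules.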

Hope this helps.  Soon, I will post about the 3-tier architecture and talk about its benefits.

What is Flex?

This should have been the first of my posts. Someone asked me this question, and there were answers, but nothing that explained it in depth. There you go…

The Adobe Engagement Platform architecture

Adobe engagement platform architecture

Universal client technology

By combining the strengths of ubiquitous Flash Player with Adobe Reader® software, HTML, and JavaScript, developers can deliver a predictable, high-quality application experience across browsers, desktops, and devices.

Programming model

The Flex development model (MXML and ActionScript) plays a central role in the platform. By providing a versatile and robust programming model, Flex enables organizations to efficiently deliver RIAs that take advantage of the universal client technology.

Development and design tools

With products like Adobe Photoshop®, Dreamweaver®, Flash Professional, and Illustrator®, Adobe is a recognized leader in the creative tools market. Through integration with Flex Builder and third-party development tools, Adobe is enabling designers and developers to work together to deliver more engaging experiences.

Server framework

Adobe server technologies build on existing infrastructure standards like Java EE and .NET, while providing services that simplify integration and extend the capabilities available to rich clients. Beyond the services provided by Flex Data Services, Flash Media Server enables applications to integrate two-way audio and video streaming, while Adobe LiveCycle® software provides services for business process management, document generation, and information assurance.

The goal of the Adobe Engagement Platform is to blend the strengths of the Adobe technologies and open source standards to provide a versatile foundation.

Flex Product line

1. Flex Software Development Kit (SDK) – The core component library, development languages and compiler for Flex applications. This is open source and

2. Flex Builder IDE – An Eclipse based development environment that provides tools like code editors, visual layout tools, project management tools and an integrated debugger

3. Flex Data Services – Code named BlazeDS, is a Java server-based application that enables high-performance data transfer

4. Flex Charting – A library of extensible charting components that enable rapid construction of data visualization

Flex product line

Flex runtime architecture

The Flex runtime architecture is closely aligned with the just-in-time deployment model of web applications. The client portion of a Flex application is deployed as a binary file that contains the compiled bytecode for the application. Users then deploy this file to a web server just as they would an HTML file or an image. When the file is requested by a browser, it is downloaded and the bytecode is executed by the Flash Player runtime.

As illustrated in the figure below, once started, the application can request additional data or content over the network via standard HTTP calls (sometimes referred to as REST services) or through web services (SOAP). Flex clients are server agnostic and can be used in conjunction with any server environment, including standard web servers and common server scripting environments such as JavaServer Pages (JSP), Active Server Pages (ASP), ASP.NET, PHP, and ColdFusion®.

Flex runtime architecture

Flex development model and application framework

The development process for Flex applications mirrors the process for Java, C#, C++, or other traditional client development languages. Developers write MXML and ActionScript source code using the Flex Builder IDE or a standard text editor. As shown in Figure 4, the source code is then compiled into byte-code by the Flex compiler, resulting in a binary file with the *.swf extension.

Flex framework

The MXML markup language

Like HTML, MXML is a markup language that describes user interfaces that expose content and functionality. Unlike HTML, MXML provides declarative abstractions for client-tier logic and bindings between the user interface and application data. MXML helps maximize developer productivity and application reusability by cleanly separating presentation and business logic.

The following code listing uses MXML to define the user interface for a login form. This example uses some very basic controls.

<?xml version="1.0" encoding="utf-8"?>
<mx:Application xmlns:mx="http://www.adobe.com/2006/mxml" layout="absolute">

<mx:Label x="47" y="19" text="Username:"/>
<mx:Label x="47" y="61" text="Password:"/>
<mx:TextInput x="170" y="17"/>
<mx:TextInput x="170" y="59"/>
<mx:Button x="112" y="115" label="Cancel"/>
<mx:Button x="218" y="115" label="Login"/>

</mx:Application>

Sample Login page

ActionScript 3.0

ActionScript is the object-oriented programming language used for Flex development. Like JavaScript, ActionScript 3.0 is an implementation of ECMAScript, the international standardized programming language for scripting. However, because it is an implementation of the latest ECMAScript proposal, ActionScript provides many capabilities not common in the versions of JavaScript supported by most browsers. At development time, ActionScript 3.0 adds support for strong typing, interfaces, delegation, namespaces, error handling, and ECMAScript for XML (E4X).

At runtime, the most significant difference between JavaScript and ActionScript is that ActionScript is just-in-time compiled to native machine code by Flash Player. As a result, it can provide much higher performance and more efficient memory management than interpreted JavaScript. Flex developers use ActionScript to write client-side logic, such as event listeners and call-back functions, or to define custom types for the client application. For example, the following code shows the definition of the Customer class.

Flex Data Services

Flex Data Services extends the capabilities of the Flex client framework by providing additional services for managing data transfer and integrating with existing applications and infrastructure.

As illustrated below, Flex Data Services fits into an organization’s existing deployment environment. It is implemented as a Java web application and can be deployed on standard Java application servers, including IBM WebSphere, BEA WebLogic, Adobe JRun, JBoss, Tomcat, and others.

Flex Data Services

The figure shows a high-level overview of the services provided by Flex Data Services. When working with Flex Data Services, developers define a set of “destinations” using XML configuration files. These definitions are used by the built-in service adapters provided as part of the Flex Data Services application. These include low-level adapters to connect to Java objects (data access objects), JMS topics/queues, or ColdFusion components (CFCs) as well as higher level adapters for common persistence solutions such as Hibernate, Enterprise JavaBeans (EJB), and Spring. The Flex Data Services adapter architecture is open and customizable, allowing connectivity to any back-end data system or application.

Flex Data Services capabilities

Additional resources

About

  • http://labs.adobe.com/technologies/flex/
  • http://www.adobe.com/products/flex/

Development resources

  • Flex development center: http://www.adobe.com/devnet/flex/
  • Data services: http://www.adobe.com/products/livecycle/dataservices/
  • Flex team blog: http://weblogs.macromedia.com/flexteam/
  • Live Docs: http://livedocs.adobe.com/labs/flex3/html/
  • Language reference: http://livedocs.adobe.com/labs/flex3/langref/
Categories: Adobe, Beginner, Flex