
Open for Business

Noam Tamarkin

Optimizing Load in a Large Batch Process

Hi all,

I want to provide an example that combines the ideas from my last two posts: "Thinking Inside the Box but Out of the Context" and "Resource optimization for the Virtual infrastructure".

In a large enterprise system or ERP, there are normally some batch jobs that you need to run daily, weekly, monthly or on occasion.

These batch runs can handle millions of records in a single batch. It is also common that the records are independent of each other, e.g. a customer details update from HQ. Each record may also take a different amount of time to process, e.g. one customer has multiple accounts in our organization while another is just historical data.

There are many methods to optimize the process. Most of them are based on distributing the records across several servers that perform the same process. For example, you break the million records into smaller batches of 10,000, then send each batch of 10,000 to a different server. There are two major issues with this solution:

  1. The split into batches of 10,000 is arbitrary, so you cannot predict the processing time of each batch.
  2. You have to develop control software to manage the start and end of each batch of 10,000.
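To make the first issue concrete, here is a small sketch of how fixed-size batches can end up with very different run times. The per-record costs are invented for illustration, and the example assumes the expensive records sit next to each other in the input, as they might if the file is sorted by customer type.

```python
def chunk_times(costs, chunk_size):
    """Return the total processing time of each fixed-size chunk."""
    return [
        sum(costs[start:start + chunk_size])
        for start in range(0, len(costs), chunk_size)
    ]


# Hypothetical costs: eight cheap records (1 time unit each) followed by
# four expensive ones (20 units each, e.g. customers with many accounts).
costs = [1] * 8 + [20] * 4

times = chunk_times(costs, chunk_size=4)
print(times)       # [4, 4, 80]
print(max(times))  # 80: the whole run waits for the slowest chunk
```

Two of the three servers finish almost immediately and then sit idle while the third works through the expensive chunk.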

So, what do I propose?

Still within the Box but out of the normal context, this is my proposal.

You prepare servers that can perform the process. You can use servers that are used for other purposes during the day. Then, expose a web service that accepts a single record to process. In front of the servers, you place a standard web load balancer.
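As a rough illustration of what such a single-record service could look like, here is a minimal sketch using only the Python standard library. The JSON payload shape, the port, and the `process_record` function are assumptions made for the example; the real service would wrap whatever business logic the batch job runs per record.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def process_record(record):
    """Placeholder for the real per-record business logic (hypothetical)."""
    return {"id": record.get("id"), "status": "processed"}


class RecordHandler(BaseHTTPRequestHandler):
    """Accepts one JSON record per POST request and processes it."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            record = json.loads(self.rfile.read(length))
            result, status = process_record(record), 200
        except (ValueError, AttributeError) as error:
            # Malformed JSON, or a payload that is not a JSON object.
            result, status = {"error": str(error)}, 400
        payload = json.dumps(result).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep the console quiet during a very large run


def make_server(port=8080):
    """Build the server; the port number is an arbitrary example value."""
    return ThreadingHTTPServer(("0.0.0.0", port), RecordHandler)


# To start one instance on a processing server:
#     make_server().serve_forever()
```

Each processing server would run one instance of this, and the load balancer would be configured with the list of those instances.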

Now, you iterate over the million records, one by one, and call the service with each record.

The load balancer will do the optimization for you, and you do not need to control anything besides the start and end of the million-record run.
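A driver for this could be sketched roughly as follows. The URL, the JSON payload, and the names `call_service` and `run_batch` are all hypothetical. The sketch also assumes the driver keeps several calls in flight at once, which the post does not spell out: if each call waited for the previous one to finish, only one server would be busy at a time and the load balancer would have nothing to spread.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Hypothetical address of the load balancer; not from the original post.
SERVICE_URL = "http://batch-lb.example.internal/process"


def call_service(record, url=SERVICE_URL, timeout=30):
    """POST one record as JSON to the load-balanced endpoint."""
    body = json.dumps(record).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def run_batch(records, send=call_service, max_in_flight=32):
    """Send every record through `send`, with up to max_in_flight open calls.

    Returns (succeeded, failed), where failed lists (record, error) pairs.
    """
    succeeded, failed = 0, []

    def attempt(record):
        # Capture errors per record so one bad record does not stop the run.
        try:
            send(record)
            return record, None
        except Exception as error:
            return record, error

    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # Executor.map queues all records up front and yields results
        # in input order.
        for record, error in pool.map(attempt, records):
            if error is None:
                succeeded += 1
            else:
                failed.append((record, error))
    return succeeded, failed
```

The only control logic left in the driver is the loop itself and the list of failed records, which can be retried or reported once the run ends. The value of `max_in_flight` is a tuning knob and would need to be sized against the number of servers behind the load balancer.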

This is the architecture:

[Figure: Optimizing_load_of_batch.gif]

Let me know if you used it!

Noam



In this blog, Noam Tamarkin provides ideas for improving and better integrating your applications.

Noam Tamarkin

Senior software architect and CTO. Experience in solution design and implementation. Holds the ability to understand complex business processes and translate them to technology. Expert in Enterprise applications, integration, SOA, SaaS. Experienced in project management, technical infrastructure, procurement and manufacturing.
