Open for Business

Noam Tamarkin

Optimizing Load in a Large Batch Process

Hi all,

I want to give an example that combines the ideas from my last two posts: "Thinking Inside the Box but Out of the Context" and "Resource optimization for the Virtual infrastructure".

In a large enterprise system or ERP, there are normally some batch jobs that you need to run daily, weekly, monthly or on occasion.

These batch runs can handle millions of records in a single batch. It is also common that the records do not depend on each other, e.g. a customer details update from HQ. You probably know that each record can take a different amount of time to process, e.g. one customer has multiple accounts in our organization while another is just historical data.

There are many methods to optimize the process. Most of them are based on some kind of distribution of the records to several servers that perform the same process. For example, you break the million records into smaller batches of 10,000, then send each batch of 10,000 to a different server. There are two major issues with this solution:

  1. The split into batches of 10,000 is arbitrary, so you cannot predict how long each batch will take to process.
  2. You have to develop control software to track the start and end of each batch of 10,000.
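
To make the first issue concrete, here is a minimal sketch of fixed-size batching. The workload is an assumption made up for illustration (most records cost 1 time unit, about 2% cost 50), and the function names are hypothetical. It only shows how the total work per batch can differ even though every batch holds the same number of records.

```python
import random


def split_into_batches(costs, batch_size):
    """Split per-record costs into consecutive fixed-size batches."""
    return [costs[i:i + batch_size] for i in range(0, len(costs), batch_size)]


def batch_durations(costs, batch_size):
    """Total time of each batch if one server works through it serially."""
    return [sum(batch) for batch in split_into_batches(costs, batch_size)]


if __name__ == "__main__":
    rng = random.Random(42)
    # Assumed workload: cheap records cost 1 unit, roughly 2% cost 50 units.
    costs = [50.0 if rng.random() < 0.02 else 1.0 for _ in range(100_000)]
    durations = batch_durations(costs, 10_000)
    print("fastest batch:", min(durations))
    print("slowest batch:", max(durations))
```

The server that happens to receive the slowest batch decides when the whole run ends, while the others sit idle.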

So, what do I propose?

Still within the Box but out of the normal context, this is my proposal.

Prepare a set of servers that can perform the process; these can be servers that are used for other purposes during the day. Then expose a web service that accepts a single record to process. In front of the servers, place a standard web load balancer.
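
A single-record service of this kind could look roughly like the sketch below, written with Python's standard `http.server` module. The JSON payload shape, the port, and the `process_record` placeholder are all assumptions for illustration; the real business logic and the web stack depend on your system.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def process_record(record):
    """Placeholder business logic (hypothetical): mark the record as processed."""
    return {"id": record.get("id"), "status": "processed"}


class RecordHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly one JSON record from the request body.
        length = int(self.headers.get("Content-Length", 0))
        try:
            record = json.loads(self.rfile.read(length) or b"{}")
            status, body = 200, process_record(record)
        except (ValueError, AttributeError):
            status, body = 400, {"error": "invalid record"}
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; a real service would log requests


def make_server(host="127.0.0.1", port=8080):
    """Build the server; the same code would run on every worker machine."""
    return ThreadingHTTPServer((host, port), RecordHandler)


if __name__ == "__main__":
    make_server().serve_forever()
```

Because the service keeps no state between calls, any server behind the load balancer can handle any record.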

Now, iterate over the million records one by one and call the service with each record.
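
The driving loop can be sketched as follows. One assumption is worth stating loudly: for the load balancer to have something to spread, the caller needs several requests in flight at once, so this sketch uses a thread pool instead of a strictly sequential loop. The URL, `send_record`, `run_batch` and `max_in_flight` are hypothetical names chosen for the example.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Hypothetical address of the load balancer in front of the servers.
SERVICE_URL = "http://batch-lb.example.internal/process"


def send_record(record, url=SERVICE_URL, timeout=30):
    """POST one record as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(record).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())


def run_batch(records, send=send_record, max_in_flight=32):
    """Call `send` for every record with at most max_in_flight calls running.

    Returns (succeeded, failed); failed holds (record, exception) pairs.
    """
    succeeded, failed = [], []

    def attempt(record):
        try:
            return record, send(record), None
        except Exception as exc:  # collect failures instead of aborting the run
            return record, None, exc

    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # Note: pool.map queues all records up front; a bounded queue would
        # be gentler on memory for very large inputs.
        for record, result, error in pool.map(attempt, records):
            if error is None:
                succeeded.append(result)
            else:
                failed.append((record, error))
    return succeeded, failed
```

The only control logic left in the caller is the start of the loop, the end of the loop, and the list of failed records to retry.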

The load balancer does the optimization for you, and you do not need to control anything besides the start and end of the run over the million records.
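
A small simulation can illustrate why this tends to finish sooner. It is a sketch under assumptions: the load balancer is modeled as handing each record to whichever server frees up first (roughly what a least-connections policy aims for), network overhead is ignored, and the function names are hypothetical.

```python
import heapq


def static_makespan(costs, servers):
    """Finish time when records are pre-split into equal consecutive chunks."""
    size = -(-len(costs) // servers)  # ceiling division: one chunk per server
    return max(sum(costs[i:i + size]) for i in range(0, len(costs), size))


def dynamic_makespan(costs, servers):
    """Finish time when each record goes to the server that frees up first."""
    free_at = [0.0] * servers  # time at which each server becomes idle
    heapq.heapify(free_at)
    for cost in costs:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + cost)
    return max(free_at)


if __name__ == "__main__":
    costs = [10, 10, 1, 1]  # two expensive records followed by two cheap ones
    print("pre-split chunks:", static_makespan(costs, 2))
    print("record by record:", dynamic_makespan(costs, 2))
```

In this toy case the pre-split run is held up by the chunk containing both expensive records, while record-by-record dispatch lets the two servers share them.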

This is the architecture:

[Figure: Optimizing_load_of_batch.gif]

Let me know if you have used this approach!

Noam



In this blog, Noam Tamarkin provides ideas for improving and better integrating your applications.

Noam Tamarkin

Noam Tamarkin is a senior software consultant, architect and experienced development manager.
