A common flow assurance study objective is calculating the time required to reach hydrate conditions during a shutdown (typically from steady state). This is often referred to as ‘Cooldown’. The ability for an operator to understand the amount of time that is available to safely complete various tasks during a shutdown, or how close to the hydrate region they can operate a pipeline before running into problems is incredibly valuable.
This discussion will cover how one can run large simulation matrices for cooldown efficiently with flotools and will compare the flotools methodology with traditional methods using the OLGA GUI.
In flotools, you can setup and create large numbers of cases for a project, and post-process the results quickly. For this case, we are looking for the cooldown times to the hydrate formation temperature for a system with varying parameters: Gas Lift Rate, Flowrate, Inlet Temperature, Water cut, and different fluids. flotools makes this process very simple and easy using the Parametric Studies tool.
With a large case matrix, it can be time-consuming to create all the varying simulations in OLGA. The flotools Parametric Studies tool can create model files by replacing the text of a base file with new parameters. For example, if you have a boundary node with PRESSURE = 100 psig you can give flotools a list of comma-separated values (100, 150, 200) and flotools will create 3 separate key files that are copies of the base file, one with 100 psig, one with 150 psig, and one with 200 psig at that node.
The only inputs needed are the variables to be varied, the parameters for each variable are defined only once. Cases can be named interactively, based on the parameters used.
The naming of cases is interactive with the inputs listed above. Each variable has an index associated with it, e.g., WC is associated with %7 (seventh study variable). The index associated with each variable can then be input back into the naming line with the %n operator to have them change with each variation of the parameters. You can even create linked variables that are not used for anything except for naming; for example, the 2nd (%2) and 6th (%6) variables above. Also, the total number of cases to be generated is indicated on the generate cases button with a small green number badge, 360 in this example. This is how many cases we will be generating with these inputs, coming from:
5 flowrates x 3 gas lift rates x 4 temperatures x 2 fluids x 3 water cuts = 360 cases
Once the matrix is set and named appropriately for the parameters of our study, flotools will create the 360 .key files in a location of our choice. flotools can output the .key/model files to any location in the directory structure as well as generate a batch file if necessary. The cases are then ready to run.
Due to the large number of cases, post-processing of the results was time consuming. When using the OLGA GUI to extract simulation results for many cases you are limited by the number of cases that can be load into the GUI simultaneously. For this project, each case output file was relatively small (.tpl and .ppl file size combined to about 2 MB for each case). We were only able to load 47 of the fully completed simulations into the GUI at one time. When extracting trend and profile data you can only take data from those cases loaded into the GUI at a given time. This project required 8 separate extractions to get the full data set, which consumed a lot of extra time to load cases into the OLGA GUI.
Whereas with the flotools route, you can load the cases into a flotools workspace and be ready for plotting immediately.
Both methods are the setup for plotting and tabulating results. In Excel, to create a plot of cooldown times requires some combination of MATCH and INDEX/OFFSET formulas for each series of the MDTHYD output in the case matrix. Following this, the rest of the data must then be formatted in the right way to form a table/pivot table that you can then use to plot based on different parameters.
In flotools, there are built-in calculations for DTHYD and its (MDTHYD, MDPHYD, MAXDTHYD, MAXDPHYD), which can be used without even having to input a hydrate curve in OLGA. Hydrate curves can be input in flotools during post-processing, which flotools will then reference in comparison to the temperatures and pressures over the entire flow path during the simulation.
A comparison of the methods described above are illustrated in the following figure:
Even for large case matrices, creating all the model files to run is incredibly easy and fast with flotools. Without pre-processing the entire set of cases was created in less than 2 minutes in this instance.
Conversely, post-processing using the OLGA GUI and Excel took roughly 5 times as long to create a finished product of plots or tables. This is mainly due to limitations of the OLGA GUI as to how much info can be stored at one time, i.e., how many cases it can load simultaneously. Having to export the data in chunks was a time-consuming part of the process.
For this set of 360 cases, the goal was to create 11 plots that show the effects of each set of parameters of the parametric study. An example plot created in flotools is show below.
Time breakdowns of each method are given below (these times reflect an experienced user of each method):
Open cases in GUI and export MDTHYD to .csv (22 minutes)
Find first MDTHYD > 0 °C (2 minutes)
Format all parameters and cooldown times in a table format (5 minutes)
Create pivot table, slicers, and pivot charts (7 minutes)
Format plots (create titles, axis labels, etc.) (20 minutes for 11 plots)
Total (56 minutes)
Open flotools and load cases (5 minutes)
Open parametric study tool and select variables and filters/slicers. (2 minutes)
Change plot formatting to liking (2 minutes)
Duplicate plot and change filters/slicers (2 minutes for 11 plots)
Total (11 minutes)
This time difference is especially impactful if the case matrix is updated or if simulations are re-run. If a similar parametric study needs to be run, you can simply copy the parametric study in flotools and modify it instead of having to make a completely new one, whereby saving time on future work.
flotools makes running common flow assurance tasks such as cooldowns easy to create, run and report, even if with many cases. There is a significant time savings with flotools as compared to using the OLGA GUI because flotools can handle all the cases from a project at once and can plot them instantly rather than requiring post-processing in Excel.
Piping and pipeline design engineers are familiar with the dangers of pressure surges (commonly referred to as “water hammer”) in liquid-filled pipes/pipelines. Pressure surges can be caused by transient events, such as valve closures and pump trips, that cause a pressure wave to move through the piping system at a high speed. The magnitude and speed of the pressure wave are determined primarily by the nature of the transient event (for instance, how fast the valve closes) and the speed of sound of the fluid, but also depend on the pipe material, diameter, wall thickness, and conditions of whether/how the pipe is constrained from moving. Pressure surges can burst the pipe or produce a vacuum, which may lead to collapse of the pipe.
A problem associated with pressure surges, transient piping loads, is also important, but less well understood by most engineers. This discussion examines the fundamentals of transient piping load analysis, the tools that are used to perform the analysis, and how evoleap can add value with this type of analysis.
Consider pipe segment that contains a straight horizontal section, with two vertical sections; each with a 90° bend connecting the vertical sections to the horizontal section as shown in Figure 1. The fluid exerts four forces on the pipe: gravity, friction, the downstream pressure force (F2), and upstream pressure force (F1). At steady-state conditions the pipe is static, so the sum of the forces in the horizontal and vertical directions must be zero. In the horizontal direction the friction force (Ffric) balances the difference in the pressure forces (F1 and F2). In the vertical direction the gravitational force (Fgrav) is balanced by upward forces exerted by pipe supports and/or the pipe segments at the bends.
In almost all cases, the frictional force, as we will show, is small and can be ignored. Also, the gravity force, provided the pipe remains full of liquid, is effectively constant, so it is irrelevant to a transient analysis. The pressure forces are equal to the product of the pressure at the specified location and the inner pipe wall surface area. With 90° bends, the direction of the force is along the pipe axis of the pipe segment. The sum of the forces in the horizontal direction is given in the equations below.
When a valve is closed downstream of the segment, the behavior is different. The valve closure will generate a pressure surge that will propagate backwards through the piping system. At some point, shortly after the valve is fully closed, the pressure wave will reach the pipe segment of interest as illustrated in Figure 2. At that time, the pressure at the downstream bend is much greater than the pressure at the upstream end, resulting in a large, net force in the direction of flow, which can cause damage to the pipe and/or supports or cause the pipe to jump off its supports. For a long pipe segment (~1000 ft), the pressure wave will traverse the entire segment in roughly 0.3 s.
Synergi Pipeline Simulator (SPS) is the industry-leading, transient, single-phase, pipeline simulator and is well-suited for this type of problem. In plants and terminals there are hundreds or thousands of pipe segments. Calculating and extracting the piping loads for different transient cases is a tedious, but important task. This has led evoleap to develop automated techniques for setting up, calculating, and extracting the piping loads from transient simulations for complex piping systems.
The piping load calculations are done via SPS’s application development language, intran, using macros. Macros can be thought of as subroutines that are invoked via a single statement that passes the necessary arguments. Thus, one macro can be used to calculate all the pipe forces where the macro is invoked for each pipe segment. The code to invoke the macros is generated automatically using Excel. SPS calculates the piping load for each segment at every time step in the simulation. The results, piping loads as a function of time and extreme piping loads, are extracted and analyzed automatically with evoleap’s inhouse tools.
We take as an example, the flow of Liquified Natural Gas (LNG) through a 30”, horizontal pipe segment 790 ft long, bounded by 90° bends upstream and downstream initially flowing at a standard loading/unloading rate for LNG. These are typical pipe sizes and segment lengths for LNG loading and receiving terminals. LNG loading and unloading terminals must be particularly cognizant of pressure surges and the resulting piping loads because they operate at high flow velocities, typically at 15 ft/s or greater and have fast valve closures during Emergency Shutdowns (ESDs).
The pressure profile at steady state through the pipe segment is shown in Figure 3 where the left axis (~80 ft) is at the upstream 90° bend. The downstream 90° bend is at the right axis at ~870 ft. Since the pipe is horizontal the difference in pressure across the pipe segment is accounted for entirely by friction. The pressure drop is only 2.9 psi, so the frictional force is obviously small. For instance, the force on each end is P*A = 68.8 psi * 672 in2 = 46.2 kips whereas the frictional force is (P2 – P1) * A = 2.9 psi * 672 in2 = 1.9 kips. In a surge event the pressure forces will be significantly larger (higher pressures) and frictional force will be smaller because the flow will approach zero. In fact, when the pressure wave reaches the segment, the flow rate, and the frictional force, will be very close to zero.
At time = 5 s, an ESD is initiated, which closes the ESD valves that isolate the loading system from the loading arms/LNG carrier. The ESD valves close in 18 s, which is within the typical range for LNG loading/unloading systems. The resulting pressures upstream and downstream of the pipe segment are shown in Figure 4. Very little happens as the ESD valves are initially closing. As the ESD valves near closure, pressures rise steadily due to line packing. The valves are fully closed at 23 s. In the last second prior to closure the pressure rises rapidly reaching a maximum value of 310 psig.
Figure 5 shows the same data but zoomed in on the time when the rapid pressure rises and maximum pressures occur. At 22 s, the upstream and downstream pressures are 153.5 and 183.6 psig, respectively, giving a net force of (183.6 – 154.5) * 672 = 19.6 kips exerted in the upstream direction. Such calculations are performed at every time step for every pipe segment across the entire simulation. The results can be used to compare with rated limits to determine fitness for service. In this case, the Maximum Surge Pressure (MASP) is 440 psig, which is equivalent to 296 kips. Also, the raw results can be exported and used in standard piping analysis software, such as Caesar II. In general, if piping loads exceed limits, possible mitigations are:
Changes in the closing characteristics of the valves (e.g. faster initially and slower at the end),
Changes in the valve closure times,
Reduction of the maximum allowable flow rate,
Addition of surge relief,
Addition of surge tanks,
Shorten segments (e.g. expansion loops),
Bypasses that divert flow.
Pressure profiles across the pipe segment at 1 s intervals are shown in Figure 6. The piping load at any time is directly proportional to the difference in pressure across the segment. The maximum piping load depends on the sharpness of the pressure wave (steeper slope), the length of the pipe segment (bigger pressure difference), and the area of the pipe (larger pipes will experience larger loads).
More complicated geometries also require consideration. For example, Figure 7 shows a common configuration with a pair of 45° bends. At a 45° bend, the pressure force is directed at a 45° angle to the pipe axis. Thus, there is a net force, even under steady-state conditions, on the segments 1-2 and 3-4. Under steady state conditions, the sum of forces F1, F2, F3 and F4 are effectively zero. However, as the forces are offset, there is a moment imposed on the system which tends to make the piping system straighten out. This is the same effect that makes a garden hose straighten out when it is pressurized.
Other situations that require special consideration are pipe segments with check valves or valves that may be closed or are actuated in the transient event and pipes with a change in diameter.
For further information, please contact Trent Brown at email@example.com.
This blog post will describe how flotools’ calculation feature was used to calculate the hydrate formation temperature as a function of time and location for a methanol-inhibited system during a cold-well start up. The analysis demonstrated that a less conservative use of hydrate inhibitor was acceptable. The well and flowline were successfully started up based on this analysis. This example is an advanced use of the flotool calculation feature.
The scenario is a well and flowline where the well is initially shut-in; the flowline is filled with a seawater/methanol mixture (15% methanol by volume); and a methanol pill has been placed just downstream of the wellhead choke. The choke will be opened, and the hydrocarbons will push the methanol pill and methanol-seawater mixture out of the flowline. Unfortunately, due to the pipeline profile and the different densities of the hydrocarbon liquid, gas, and aqueous phases, the displacement of the methanol-seawater mixture will not be clean. The hydrocarbons can potentially flow past the methanol pill, contact the methanol-seawater mixture, and form hydrates if the temperature/pressures conditions are suitable.
The goals are to ensure that the flowline will not be susceptible to hydrate formation conditions at any point in time or location during the startup sequence and to minimize the size of the methanol pill. This analysis is a combination of 6 calculations: 5 simple ones, and a final complicated calculation that does the heavy lifting.
First, let us go through the supplementary calculations:
The first calculation calculates the water volume in each pipe section. (The density must be calculated instead of using the OLGA water density variable because methanol is considered to be included in the aqueous phase in OLGA)
The second calculation determines the methanol volume in each pipe section.
The third calculation uses the two previous calculations to determine the percent methanol in the water in each pipe section.
Note that the three calculations shown above could be done in a single calculation, but to be able to output the water and methanol volumes separately as their own variable they are required to be standalone calculations.
Then, because hydrates can only form under specific conditions, the fourth calculation checks for:
The existence of hydrocarbon in each section.
Whether water is present in each section.
Whether the methanol concentration in the water is sufficiently high that hydrates cannot form at any practical temperature/pressure conditions in the flowline. This is a user specified threshold that can be manually changed. For this case the threshold is 50% methanol in the aqueous phase.
Now to the fifth calculation, which is the most complicated and calculates the DTHYD variable. DTHYD is a flotools variable and is defined as the ΔT between the fluid temperature and the hydrate formation temperature at the pressure at a given location and time. The most difficult of the calculations so far, this one includes the use of lookup tables, interpolations, and the above calculated variables.
The process order inside the flotools calculation module is shown below:
The hydrate curves for differing levels of methanol are generated from a 3rd party program, in this case MULTIFLASH, and imported into flotools.
The pressure inside each pipe section is identified.
The corresponding hydrate formation temperature at the pipe section pressure for each of the hydrate curves, representing different methanol concentration, that have been imported is calculated (6 curves for this example as shown in Figure 1).
Using the methanol concentration variable already calculated for each section, the two curves that bound the methanol concentration are determined. That is, if the methanol concentration is 22% then the calculation should use the 20% and 30% hydrate curves.
The final hydrate formation temperature is calculated by interpolating between the values that we looked up in Step 3 for the two bounding curves found in Step 4. This gives the hydrate formation temperature at the specified pressure and methanol concentration.
Subtract the actual temperature in each section from the interpolated hydrate formation temperature in each section at every time the profile was output.
This fifth calculation would be much simpler if there were only two hydrate formation curves to interpolate between, but because we are using a fine grid for the methanol content, the difficulty comes in determining which two curves to interpolate between in the flotools calculation. The selection of the correct two curves is done by using the methanol concentration calculated in the third calculation and a series of nested IF statements.
The nested IF statements work somewhat like a stepladder, it first checks whether the methanol concentration is above 0% and if so checks if it is above 10%, then 20%…etc. Once the IF statement finds a value that it is not above, it exits the nested IF and declares that that value is the upper bound. i.e., if the actual percent inhibition is 37%, once it realizes that it is not greater than 40% it then decides that 40% is the upper bound of the two curves. To identify the lower bound we first check if the upper bound is greater than 0 and if it is, we subtract 10% from the upper bound to calculate the lower bound. This is easy because the hydrate curves we generated were generated in uniform 10% increments.
If the methanol increments were not uniform another set of nested IF statements would be needed. The IF statements would follow the same stepladder approach but once a value was found to be above the actual methanol concentration, the IF statement would output the previous value, i.e., from our 37% example above, the IF statements would need to output next to the last value as well (i.e., the value before 40% which would be 30%).
The sixth calculation is the final output of the entire analysis. The sixth calculation does a final check to see if all three of the above conditions are met (the conditions in the fourth calculation), and if so, outputs the DTHYD results from the interpolation, otherwise it outputs a null number that can be manually changed (in this case -50).
We can see the actual application of the calculation in a case described below. This case has a methanol pill placed at the beginning of a flowline with a well shut in upstream. The rest of the flowline is saturated with a seawater-methanol mixture of 15% methanol by volume (Figure 2).
The well is then opened, and the hydrocarbons begin to push the methanol pill out of the flowline. The methanol pill disperses slightly and some gas that had accumulated at the top of the well during the shut-in squeezes past the methanol pill. The gas has the potential to form hydrates as the three conditions in the fourth calculation are met when gas moves in front of the methanol pill.
Figure 3 is quite complicated, but it very nicely illustrates exactly where hydrates can form and the ΔT between the fluid temperature and the hydrate formation temperature (Note: when DTHYD > 0 hydrates can form).
The red curve is the methane fraction as a function of distance. Methane is used as a proxy to determine whether hydrocarbons are present.
The blue curve is the methanol concentration (volume %) as a function of distance.
The green curve is DTHYD = fluid temperature minus hydrate formation temperature.
Three distinct regions are evident on Figure 3. Two of the regions are not at risk for hydrate formation. On the right-hand side (shaded) of the graph there is a steep drop-off of the Green DTHYD curve because there are no hydrocarbons in this region as shown by the red, methane-fraction curve.
The left-hand side (shaded) is a trickier region as there are two constraints to hydrate formation:
Either there is no seawater left (it has all been pushed out)
Or the hydrocarbon fluid is more than 50% inhibited with methanol (our specified criterion for full hydrate inhibition). Note that on the left-hand side, there could be produced water but for this example the produced water was sufficiently salty to inhibit hydrate formation.
In the middle (unshaded), hydrocarbons are present, and the methanol concentration is below our threshold of 50%. To determine if hydrate formation is possible, we must look at the DTHYD variable. In this region, the maximum value of DTHYD is -4 °F, indicating that hydrate formation is not possible. Therefore, the methanol pill size is acceptable for preventing hydrates.
We came across an interesting observation while designing surge volume calculations in flotools that I thought was worth sharing and inviting comments from the flow assurance community. For those that are not 100% sure what surge volume is exactly, let me first explain.
Calculating surge volumes is a routine part of a flow assurance engineer’s work. Operational scenarios like slugging, pigging, production ramp-up in multiphase production systems can all result in large volumes of liquids being swept out of the pipeline and into the first vessel on the receiving facility. Often, these liquid surges come in at rates that far exceed the receiving facility’s capacity to process liquids. Therefore the vessel, typically a slug catcher, acts as a buffer where the surge of liquid can be collected and processed over time. One of the objectives of performing flow assurance studies is to quantify the maximum surge of liquid that can be seen across various operations in order to size the slug catcher appropriately. The maximum volume that a slug catcher will have to hold for a given operation is called the surge volume.
OLGA provides a way to calculate surge volumes whenever at least one of ACCLIQ, ACCOIQ, and ACCWAQ is included in the list of trended outputs. The calculation assumes that the slug catcher is present just downstream of the location where these variables are trended and that the vessel can be drained at a fixed maximum drain rate during the operation.
The calculation performed by OLGA is described by the following equation:   
is the OLGA reported cumulative volume of liquid at time step ,
is the elapsed simulation time at time step ,
is the maximum drain rate of the slug catcher,
and mark the time window in the simulation where the calculation is done, and
is the calculated surge volume
In this post I will look at two interesting properties of this calculation:
Why were the accumulation variables (ACC*) used instead of the instantaneous rate variables (QL*)?
Why is there a operation in equation (2)?
Accumulation Variables vs. Instantaneous Rate Variables
OLGA’s calculation of surge volume uses the accumulated variables as the basis of the surge volume calculation instead of the instantaneous liquid volume rate variables (QLT, QLTHL, QLTWT). To understand why, let’s look at the instantaneous rate form of the surge volume equation.
When using the instantaneous rate variables, equation (2) becomes, 
If we compare the accumulation terms from (2) and (4), we see the following assumed relationship: 
In other words, the average of the instantaneous rates in a particular time window is approximately equal to the average accumulation rate in that time window. This is typically a bad assumption because the instantaneous rates capture rate spikes that are very short in duration and would not be indicative of the average rate for the corresponding time window. The average rate can be calculated as follows: 
The following chart shows a comparison between an actual QLT output from OLGA and the associated average QLT calculated from ACCLIQ according to equation (6):
You can see that the average QLT (from ACCLIQ) does not show the flowrate spikes that the QLT variable shows. These spikes, while they probably do occur in a flowing system, typically occur in very short time windows smaller than the output interval of the simulation. The larger the output interval, the worse the assumption.
The following chart shows an example of the error in accumulation by comparing the calculated accumulation using the rate variable and subtracting the OLGA calculated ACC variable from it. While the maximum error in this example (~25 barrels) is not significant, the magnitude of the error entirely depends on the nature of the simulation and may be significant in some cases.
In our view, OLGA has taken the correct approach and used the accumulated variables as the basis for the surge volume calculation.
Handling Negative Terms
Equation (2) features a operation. This ensures that the calculated volume in the slug catcher never goes below zero. But what happens when the quantity becomes negative?
It is perfectly normal and valid for a numerical simulator to predict negative rates at an outlet boundary. When OLGA predicts negative rates at the outlet of the pipeline, the ACC variable may reduce in value from one time step to the next. When this happens, equation (2) will result in a reduction in the calculated slug catcher volume at a rate faster than the assumed drain rate. Effectively, the calculation does not prevent the possibility that liquid can leave via liquid drain as well as the inlet of the slug catcher. When you look at a schematic of a typical slug catcher, like the one shown below, it becomes apparent that this may not be such a sound assumption. The slug catchers are designed for gravity separation of phases and hence the inlet nozzles are at or near the top of the vessel. Once the liquids go in, they quickly settle to the bottom. Any negative flow is likely to be mostly gas with very little liquid carried as droplets in the gas phase.
Depending on your case, using the OLGA basis for calculation may result in significant errors. In one case, we found a 10% error at a specific drain rate. The problem is that the error is not on the side of conservatism. We think the correct way to write equation (2) is as follows: 
In equation (7), we added another max function that bounds the quantity , which is the average flow rate into the slug catcher for a given time interval, to zero.
If the calculation is being done at the outlet of a pipeline that is connected to a pressure node, set the parameter GASFRACTION to 1.0 in your NODE specification. This will ensure that whenever there is negative flow at the outlet boundary, the negative flow is all gas. That said, we still think equation (7) is a better way to perform the surge volume calculation because it works well regardless of the boundary specification.
The plot above shows a comparison of surge volumes calculated according to equations (2) and (7), labeled “OLGA Method” and “Proposed Method” respectively. We can see that filtering out the negative values results in larger surge volumes at lower drain rates. At large enough drain rates, the differences eventually disappear. Given surge volume calculations are performed in order to size the slug catcher, we believe that equation (2) is not conservative and therefore should not be used. Instead, our modified version represented in equation (7), which gives a more conservative estimate of surge volume, should be used.
As always, your comments and feedback would be much appreciated.
Recently, I saw a discussion of OLGA speed in the OLGA Users group on LinkedIn. The discussion starts with the question of why OLGA performs nearly the same on two different CPUs (an Intel Core i7 processer running at 3.4 GHz and Intel Core i5 processor also running at 3.4 GHz). This result is surprising and troubling because Core i5 is a considerably cheaper processor.
I have seen flow assurance companies buy expensive hardware in hope of making OLGA go faster. Unfortunately, the results of such expense have been hit-or-miss. As a budding flow assurance consultant, I witnessed one of those misses. After purchasing hardware that was very expensive, we found that OLGA ran just as fast as it was running on desktop machines that were one year old. Since then, I have spent quite of bit of time looking at OLGA speed and working on understanding what factors impact OLGA performance.
To help flow assurance companies considering such buying decisions, I thought it might be worthwhile sharing the knowledge I have gained through my investigation. Also, I thought it might be interesting to add some data and analysis to the discussion and look specifically at how the number of threads plays a role in OLGA speed. In the LinkedIn discussion, Torgeir Vanvik from Schlumberger offered some excellent insight into the way OLGA works, and I am hoping this post sheds more light on the topic of OLGA’s parallel performance.
Key factors that affect OLGA simulation speed
There are a several key factors that affect OLGA simulation speed. Some have to do with the numerical modeling complexity and others have to do with the hardware on which OLGA is run.
On the modeling side, the most obvious factor is the complexity of the network being modeled. In general, single branch models run faster than networks and simple converging networks run faster than diverging networks or networks with looped lines. Unfortunately, this is not something flow assurance engineers can control so it is not worth discussing it further.
Next on the list are the section lengths and numerical time step. In OLGA, the simulation time step is controlled using the parameters MINDT and MAXDT in the INTEGRATION specification and also using the DTCONTROL parameters. To ensure model stability, simulations are typically run with the CFL condition controlling simulation time step. The CFL condition determines how much distance, relative to the length of a section in the model, the fluid in that section is allowed to move in one time step. The net effect is that the longer the section length, the longer your time steps are allowed to be, and vice versa. The INTEGRATION and DTCONTROL parameters along with section lengths have a profound impact on model speed. The model speed is typically governed by the smallest section in the network. I can write a whole treatise on this but that is a topic for another day.
The model speed is typically governed by the smallest section in the network
On the hardware side, the key factors that affect simulation speed are CPU and I/O speed.
Modern CPUs have two specifications that are important for our purposes – clock speed and number of cores. The clock speed is indicative of how many instructions are processed per second, and the number of cores indicate how many instructions are processed in parallel. Modern versions of OLGA (6 and above) are able to exploit the power of multiple cores whereas older versions of OLGA (5 and below) get no benefit from multi-core processors.
No matter what the version of OLGA, clock speed is important. Ultimately, it comes down to how many instructions can be processed per second, so the GHz of the processor (the bigger the better) is important.
No matter what the version of OLGA, clock speed is important
For OLGA 6 and later versions, the number of cores will also play a role in the speed. However, it is easy to fall into the trap of believing that more cores will result in faster simulation speeds. The unfortunate reality is that some tasks benefit from being processed in parallel while others don’t. If the time to split the task into small problems is greater than the time savings resulting from parallel processing, the task will actually run slower. In other words, depending on the problem there is a theoretical limit to the gains from parallelization. This is also true for OLGA.
To answer practical questions like, “Is it better to have a 3.4 GHz, 4-core CPU or a 2.4 GHz, 16-core CPU?” requires some investigation into OLGA parallelizability. In fact, we explore that very topic later in this post.
Depending on the problem there is a theoretical limit to the gains from parallelization
Since OLGA outputs simulation results to the disk as it is running, the speed at which it can write out the results can limit (sometimes severely) the run-time speed. There are two common hardware bottlenecks, the hard drive speed (when OLGA is saving locally) and the network bandwidth (when OLGA is writing to a network drive).
Most commercial-grade desktop computers and laptops ship with mechanical hard-drives that spin at 5400 or 7200 rpm, while server-grade machines often come with 10k or 15k rpm drives. The read/write access speed scales directly with the spin speed of the drive. In general, the greater the spin rate the better the hard drive when it comes to OLGA speed. Solid state drives (SSDs) are now also available cheaply and the technology has matured enough to be used in a commercial setting. However, the speed of SSDs range from worse than mechanical drives to exceptionally fast depending on the manufacturer and model. In other words, not all SSDs are as blazing fast as they would have you believe so choose carefully. It is also important to consider the computer bus interface which determines the internal data transfer rates (though these days that interface is rarely the bottle neck). Ultimately, the hard drive performance can be as important to simulation speed as the CPU.
Ultimately, the hard drive performance can be as important to simulation speed as the CPU
When saving OLGA results to a network share, the network can also limit the ability for OLGA to write simulations results. As a result, companies should ensure that the bandwidth between the computer running OLGA and the network storage is as large as possible. This will alleviate any slowdowns in OLGA speed.
When saving OLGA results to a network share, the network can also limit the ability for OLGA to write simulations results
These bottlenecks can also be avoided most of the times by carefully considering the frequency and quantity of simulation outputs.
In order to understand the factors that influence parallel speedup, we used 8 different model configurations.
Number of sections
Smallest section length (m)
Single pipeline model
Single branch – fine mesh
Single branch – coarse mesh
Converging pipeline network
Converging network with pressure-pressure boundary
Converging pipeline network – no flow
Converging-diverging network (Loop)
Two separate networks
All models were run with no trend and profile outputs to eliminate the effect of I/O on parallel speedup. To ensure the results we repeatable, each model was run multiple times utilizing a varying number of threads. A simple program was developed to run each model up to 20 times in 10 minutes (which ensured all models ran at least 2 times and many ran the full 20 times). The average run time was then calculated for each model and thread combination. It is worth noting that the run times for each simulation iteration were nearly identical. OLGA 2014.2 was used for this study (see acknowledgments at the end). The following command was used to manipulate the number of threads used by OLGA. A thread is a part of a computer program that can be managed separately by the operating system. A single core in a modern CPU can handle two threads.
opi.exe /t <num_threads> <input_file>
All simulations were run on a machine with 4 physical cores and capable of running 8 threads in parallel.
The first plot shows the speedup achieved by the various models. The ideal speedup line shows that a model using n threads should be able to achieve a speedup of ‘n’ compared to the 1 thread model. Note that without specifying the number of cores when running OLGA, the default number of threads is based on the number of CPU cores (in our case that is 4).
The plot above shows that the best performing model achieves a speedup of 3 using 4 threads, and a speedup of 4 using 8 threads. The worst performing models cap off at a speedup of ~1.6 and achieve no additional speedup beyond 5 threads. In fact, speedup of few of the models reduce when going from 7 threads to 8. However, this last artifact could be a result of using all available threads on the processor leaving the OS to switch between the computational load and background services running on the OS. We can only confirm this if we ran the test on an 8- or 16-core machine.
Another way to look at speedup is to look at a quantity called parallel efficiency which is the ratio of the actual speedup to the ideal speedup.
These two plots show that the parallel speedup tends to stagnate beyond 4 threads for most models. Most models are able to achieve a speedup of 2 or more when using 4 threads. However, by the time we get to 7 threads, only one model has a parallel efficiency of over 50%. In other words, we would be better off running two simulations simultaneously using 4 threads each, rather than running just one simulation using all 8 available threads.
Parallel speedup tends to stagnate beyond 4 threads for most models
The parallel speedup and efficiency plots showed that the efficiency of parallelization varied between various model types. So the next question is what makes a model more or less parallelizable. The flow chart below shows a simplified program structure of a parallel program.
In OLGA, the main calculation loop would be the time loop that marches time from start to the end of the simulation. The initial sequential process would be reading input files, tab files, etc. The final post-processing might include closing file handles, releasing memory, etc.
With that background in mind, we curve fitted the parallel efficiency curves with an exponential function of the following form:
I call the calculated factor the parallel efficiency decay factor. We can then plot the decay factor as a function of various aspects of the model. Our analysis shows that the decay factor is a strong function of the model runtime and the number of sections in the model.
The plot above shows that the parallel efficiency is loosely a logarithmic function of the model run time. This makes sense and follows readily from the way parallel efficiency is formulated above. (and Amdahl’s law). Skipping some math jugglery, can be rearranged to the following equation:
When , there is hardly any speedup, yielding a parallel efficiency of and when , yielding a parallel efficiency of 1. In between, we get a log-linear relationship.
The plot below shows the parallel efficiency decay factor as a function of number of sections in the model. As the number of sections increase, parallel efficiency gets better. Note that at 7000 sections, the decay factor is ~-0.1, which is probably close to the theoretical limit based on the strictly sequential parts of the simulation.
This also makes sense based on the fact that the number of computations performed in each time step is directly proportional to the number of sections and these are the computations that are computed in parallel according to the OLGA manual. So, higher the number of sections, better the parallel efficiency. However, there is a limit to the parallel efficiency as there are always sequential parts of the algorithm that cannot be parallelized.
Higher the number of sections, better the parallel efficiency
To sum it up…
Getting back to the discussion of hardware choices and their impact on OLGA speed, number of cores, clock speed and I/O speed are all significant factors. Recent versions of OLGA are multi-threaded and have the ability to run faster by utilizing multiple processor cores. We did a detailed analysis on how number of cores can impact OLGA speed and whether it is prudent to spend money on cores.
OLGA defaults to using as many threads as the number of cores available. In our analysis, the best speedup we achieved with 4 threads was ~3, a 75% parallel efficiency. In general, the more compute intensive the simulation, the better the speedup. For short simulations, multi-threading did not help. Even for a long simulation with 7000 sections, going from 4 threads to 8 threads only bumped up the speedup from 3 to 4. In general the parallel efficiency tapers off as we go beyond 4 threads. Based on our analysis, I reckon that 4 threads is a sweet spot for running flow assurance models in OLGA. You could of course fiddle with this for individual models but I would not recommend spending time on it.
Four threads is a sweet spot for running flow assurance models in OLGA
Keeping in line with our findings, the OLGA manual advises that it is better to use the available cores for simultaneous simulations rather than using them to speed up an individual simulation. However, this advice is a bit naive. For example, most professional desktop or laptop systems today have 4 cores but do not have the hard drive access speeds to support 4 simultaneous simulations writing data. The right choice lies somewhere in the middle.
If you are making a hardware buying decision I would not go beyond 4 cores when buying a computer with a mechanical hard drive. If you have enough OLGA licenses and want to centralize your simulations on one machine, the storage choice is as important as the processor choice. I would also recommend setting OMP_NUM_THREADS environment variable to 4 in order to run OLGA at an optimum parallel efficiency.
We welcome you to share your experiences and provide us feedback. If there is enough interest, we will explore the effect of CPU clock speed and disk I/O in detail in future posts.
We thank Dr. Ivor Ellul and RPS Group for running OLGA simulations and for valuable suggestions related to the analysis presented here.