How fast can I replicate data?

Very eye-catching question, isn't it? It really depends on what you are trying to replicate, on which infrastructure, and on how accurately you have studied its behaviour.

Before you study your system, I suggest getting familiar with some concepts of queueing theory I tried to make simple.

Have a quick look at the following Wikipedia articles:

just to grab an idea of what's the point.

Then search "m/m/c calculator" on Google and find this.

Yes, you'll need some GoldenGate concepts sooner or later, but NOT here.

The queues

Logical replication is more or less like a queue system:

  • you offer traffic - the changes you want to apply to the destination database;
  • someone serves requests - if you use GoldenGate you can do it in parallel, hence the "c" in "M/M/c".

I won't spend a second justifying "M/M" (requests arrive as a Poisson process and service times are exponentially distributed): let's consider it reasonable.
I won't bother you with maths: it's been too long since I studied it. Let's limit ourselves to playing with this simulator:

  • click "M/M/c"
  • we have a system with 64 servants, so let's put 64 in the "c" field
  • we offer 32 requests per time unit, so let's write 32 in the "λ" field
  • each servant has the capacity to process 1 request per time unit, so the average service time is 1: let's put 1 in the "μ" field

To summarize, we're offering 32 requests per time unit to a system that is capable of servicing 64. You'd expect each request to stay in the system exactly the time it takes to service it: it does.

Let's now offer 63, which is still less than 64: you'd expect requests to be serviced as quickly as above. They aren't: on average they stay in the system 1.8557 time units, almost twice the service time.

The lesson we learnt is that to be responsive, we must keep offered traffic way below the system's capacity.
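If you'd rather not depend on an online calculator, the same experiment can be reproduced in a few lines. What follows is only a sketch, assuming the standard Erlang C result for M/M/c queues; the function names are mine, not taken from any calculator or from GoldenGate.

```python
def erlang_b(c, a):
    """Blocking probability for c servants and offered load a = lam / mu.

    Uses the usual recurrence, which avoids huge factorials.
    """
    b = 1.0
    for n in range(1, c + 1):
        b = a * b / (n + a * b)
    return b


def erlang_c(c, a):
    """Probability that an arriving request has to wait (M/M/c)."""
    rho = a / c  # utilization of each servant
    if rho >= 1:
        raise ValueError("offered load must stay below capacity")
    b = erlang_b(c, a)
    return b / (1 - rho * (1 - b))


def mmc_time_in_system(c, lam, mu):
    """Mean time a request spends in the system: waiting plus service."""
    a = lam / mu
    mean_wait = erlang_c(c, a) / (c * mu - lam)
    return mean_wait + 1 / mu


if __name__ == "__main__":
    # Same settings as in the simulator: 64 servants, 1 request per time unit each.
    for lam in (32, 48, 60, 63):
        print(lam, round(mmc_time_in_system(64, lam, 1.0), 4))
```

With 32 requests per time unit the result is indistinguishable from the bare service time, while with 63 it should land on the figure quoted above. Trying values in between shows that the degradation is not linear: almost all of it happens in the last few percent of capacity.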

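Second lesson. This is the Pollaczek-Khintchine formula for the average number of requests L in an M/G/1 system, where λ is the arrival rate, S is the service time and ρ = λ·E[S] is the utilization: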

L = \rho + \frac{\rho^2 + \lambda^2 \operatorname{Var}(S)}{2(1-\rho)}

It teaches us two important things:

  • the closer you get to the servants' capacity (ρ tends to 1), the longer requests stay in queue (and this is lag for us)
  • the higher the variance in service times "Var(S)", the longer requests stay in queue (and this is lag for us, to be petulant...)

In other words:

  • the first point is the confirmation of what we simulated: keep far away from the limits;
  • the second point isn't so obvious: service times varying a lot imply long queues, so we must keep service times constant, regardless of the request!
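To put numbers on both points, the formula can be evaluated directly. This is a minimal sketch: the function names are mine, and the conversion from L to time in the system uses Little's law (W = L / λ), which the text above doesn't mention but is a standard result.

```python
def pk_mean_in_system(lam, mean_s, var_s):
    """Pollaczek-Khintchine: mean number of requests in an M/G/1 system."""
    rho = lam * mean_s  # utilization of the single servant
    if not 0 <= rho < 1:
        raise ValueError("utilization must be in [0, 1)")
    return rho + (rho ** 2 + lam ** 2 * var_s) / (2 * (1 - rho))


def pk_mean_time_in_system(lam, mean_s, var_s):
    """Mean time in the system (our lag), via Little's law W = L / lam."""
    return pk_mean_in_system(lam, mean_s, var_s) / lam


if __name__ == "__main__":
    mean_s = 1.0  # average service time, in arbitrary time units
    for lam in (0.5, 0.9, 0.99):  # getting closer to capacity
        for var_s in (0.0, 1.0, 25.0):  # constant, exponential, erratic service
            w = pk_mean_time_in_system(lam, mean_s, var_s)
            print(f"lam={lam} var={var_s} time_in_system={w:.2f}")
```

Reading the output by rows shows the first point (lag explodes as λ approaches capacity); reading it by columns shows the second (for the same load, a higher Var(S) means a longer lag).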

Staying far away from limits (e.g. having no bottlenecks) is definitely not enough. Don't fool yourself with bottlenecks.

Stay tuned and see how to stay far from variance and limits in a real world scenario! Yes, there will be some GoldenGate in there, at last...


About

Enrico Brambilla is an Oracle Core Technology Expert, member of Oracle Consulting. He has worked with over a hundred big and small customers across the years, gathering thoughts to share about Database, e-BS, Exadata, GoldenGate and more - not "just" technology.
