The Unspoken - Phases of CMS

CMS (-XX:+UseConcMarkSweepGC), or the concurrent mark sweep GC, could have been called the mostly concurrent mark sweep GC, and here's why.
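If you want to confirm which collectors a JVM is actually running before reading on, here is a minimal sketch using the standard java.lang.management API. The class name WhichCollector is made up for illustration. On a HotSpot JVM started with -XX:+UseConcMarkSweepGC the names reported are typically "ParNew" and "ConcurrentMarkSweep", but treat that as an expectation rather than a guarantee. Note also that the flag was deprecated in JDK 9 and removed in JDK 14, so this only applies to older JVMs.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; only the java.lang.management calls are real API.
final class WhichCollector {

    /** Returns the names of the collectors the running JVM reports. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Run with: java -XX:+UseConcMarkSweepGC WhichCollector (JDK 13 or earlier)
        System.out.println(collectorNames());
    }
}
```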

These are the phases of a CMS concurrent collection.

1. Initial mark. This is the start of a CMS collection. This is a stop-the-world phase. All the application threads are stopped and CMS scans the application threads (e.g. the thread stacks) for object references.

2. Concurrent mark. This is a concurrent phase. CMS restarts all the application threads and then, starting from the objects found in 1), it finds all the other live objects. Well, almost all of them (see the remark phase).

3. Precleaning phase. This is a concurrent phase. This phase is an optimization and you don't really need to understand what it does, so feel free to skip to 4. While CMS is doing concurrent marking (2.), the application threads are running and they can be changing the objects they are using. CMS needs to find any of these changes and ultimately does that during the remark phase. However, CMS would like to discover these changes concurrently and so has the precleaning phase. CMS has a way of recording when an application thread makes a change to the objects that it is using. Precleaning looks at the records of these changes and marks live objects as live. For example, if thread AA has a reference to an object XX and passes that reference to thread BB, then CMS needs to understand that BB is keeping XX alive now even if AA no longer is.

4. The remark phase. The remark phase is a stop-the-world phase. CMS cannot correctly determine which objects are alive (mark them live) if the application is running concurrently and keeps changing what is live. So CMS stops all the application threads during the remark phase so that it can catch up with the changes the application has been making.

5. The sweep phase. This is a concurrent phase. CMS looks at all the objects and adds newly dead objects to its freelists. CMS does freelist allocation and it is during the sweep phase that those freelists get repopulated.

6. The reset phase. This is a concurrent phase. CMS is cleaning up its state so that it is ready for the next collection.
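To make the AA/BB/XX example from the precleaning phase concrete, here is a toy, single-threaded sketch of why marking alone can miss a live object and how rescanning the recorded changes fixes it. This is not how HotSpot implements CMS. The class ToyCmsMarking, the writeRef "barrier" and the dirty set are all made up for illustration, and the interleaving of marker and application is scripted by hand rather than truly concurrent. The toy barrier also only records added references, which is all this scenario needs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model only: every name here is hypothetical, not a HotSpot class.
final class ToyCmsMarking {

    /** A toy heap object: a name plus its outgoing references. */
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    final Set<Obj> marked = new HashSet<>();
    final Set<Obj> dirty = new LinkedHashSet<>(); // objects changed since marking began

    /** Stand-in for the recording mechanism: store the reference and remember who changed. */
    void writeRef(Obj holder, Obj target) {
        holder.refs.add(target);
        dirty.add(holder);
    }

    /** Marks everything reachable from the given starting objects. */
    void markFrom(Iterable<Obj> starts) {
        Deque<Obj> stack = new ArrayDeque<>();
        for (Obj o : starts) stack.push(o);
        while (!stack.isEmpty()) {
            Obj o = stack.pop();
            if (marked.add(o)) {
                for (Obj r : o.refs) stack.push(r);
            }
        }
    }

    /** Precleaning and remark both boil down to this: revisit the recorded changes. */
    void rescanDirty() {
        List<Obj> snapshot = new ArrayList<>(dirty);
        dirty.clear();
        for (Obj o : snapshot) {
            if (marked.contains(o)) markFrom(o.refs);
        }
    }

    /** Sweep: every unmarked object is dead; a real collector would put its space on a free list. */
    List<String> sweep(List<Obj> heap) {
        List<String> freed = new ArrayList<>();
        for (Obj o : heap) {
            if (!marked.contains(o)) freed.add(o.name);
        }
        return freed;
    }

    static String demo() {
        ToyCmsMarking gc = new ToyCmsMarking();
        Obj a = new Obj("AA"), b = new Obj("BB"), x = new Obj("XX"), g = new Obj("garbage");
        List<Obj> heap = Arrays.asList(a, b, x, g);
        a.refs.add(x); // before the collection starts: AA -> XX

        // Initial mark found the roots AA and BB. Concurrent mark finishes BB first ...
        gc.markFrom(Arrays.asList(b));
        // ... then the application hands XX from AA to BB while marking is still in progress ...
        gc.writeRef(b, x);
        a.refs.remove(x);
        // ... and only afterwards does the marker trace AA, which no longer points at XX.
        gc.markFrom(Arrays.asList(a));
        boolean missed = !gc.marked.contains(x);

        gc.rescanDirty(); // preclean (concurrent) or remark (stop-the-world)
        List<String> freed = gc.sweep(heap);
        return "missedBeforeRescan=" + missed + " freed=" + freed;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this scripted run XX is unmarked after the two marking steps even though BB still uses it, and only the rescan of the recorded change marks it. The real remark phase does that final rescan with the application stopped, which is what guarantees no further changes can slip in.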

That's the important part of this blog. I'm going to ramble now.

I recently talked to some users who, of course, knew how their application worked but who didn't know so much about how garbage collection worked. Why should they? That's my job. During our chat I came to appreciate much more the fact that there were things that I habitually left unspoken. So, here's one of the unspoken truths that I forget to explain: the phases of CMS.

Comments:

Thanks for writing this.

I'm curious how the JVM stops all the threads. Is there a global flag that each thread checks every so often?

Thanks,
Blair

Posted by Blair Zajac on February 11, 2013 at 09:54 PM PST #

The VM has a safepoint mechanism (it brings the application to a halt for things such as GC) that uses a protected page (not readable, I think) to interrupt all the application threads. It's not specifically a GC thing, so I don't know the details. There is safepointing code inserted in the interpreted and compiled code that tries to touch the page and causes the application thread to be interrupted.

Posted by guest on February 12, 2013 at 07:20 AM PST #

Very nicely written Jon, as always. I'm sure I'm not alone in saying we appreciate you taking the time to share the details. I am looking forward to a writing / description along with the gory details of "perm gen removal" in a future blog. ;-)

Posted by guest on February 12, 2013 at 04:16 PM PST #

>>Very nicely written Jon, as always
Indeed.

Excellent explanation. Thanks a lot.

Posted by Eugeny Osadchuck on February 20, 2013 at 01:21 AM PST #
