Jon Masamitsu's Weblog

  • Java
    February 11, 2013

The Unspoken - Phases of CMS

Guest Author
CMS (-XX:+UseConcMarkSweepGC), the concurrent mark sweep GC, could have been called the mostly concurrent mark sweep GC, and here's why.
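
To check whether a running JVM is actually using CMS, one option is to list the collector names exposed through the standard java.lang.management API. This is a minimal sketch: the class name ActiveCollectors is made up for illustration, and it assumes that a HotSpot JVM started with -XX:+UseConcMarkSweepGC reports an old-generation collector named "ConcurrentMarkSweep" (on JDK releases that still ship CMS).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the JDK.
final class ActiveCollectors {
    // Names of the collectors the running JVM reports.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    // Assumption: CMS shows up under the name "ConcurrentMarkSweep".
    static boolean usesCms(List<String> names) {
        return names.contains("ConcurrentMarkSweep");
    }

    public static void main(String[] args) {
        List<String> names = names();
        System.out.println("collectors: " + names);
        System.out.println("CMS in use: " + usesCms(names));
    }
}
```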

These are the phases of a CMS concurrent collection.

1. Initial mark. This is the start of a CMS collection. This is a stop-the-world phase. All the application threads are stopped and CMS scans the application threads (e.g. the thread stacks) for object references.

2. Concurrent mark. This is a concurrent phase. CMS restarts all the application threads and then, starting from the objects found in 1), it finds all the other live objects. Well, almost all of them (see the remark phase).

3. Precleaning phase. This is a concurrent phase. This phase is an optimization and you don't really need to understand what it does, so feel free to skip to 4. While CMS is doing concurrent marking (2.), the application threads are running and they can be changing the objects they are using. CMS needs to find any of these changes and ultimately does that during the remark phase. However, CMS would like to discover as many of these changes as it can concurrently, and so has the precleaning phase. CMS has a way of recording when an application thread makes a change to the objects that it is using. Precleaning looks at the records of these changes and marks live objects as live. For example, if thread AA has a reference to an object XX and passes that reference to thread BB, then CMS needs to understand that BB is keeping XX alive now even if AA no longer is.

4. The remark phase. The remark phase is a stop-the-world phase. CMS cannot correctly determine which objects are alive (mark them live) if the application is running concurrently and keeps changing what is live. So CMS stops all the application threads during the remark phase so that it can catch up with the changes the application has been making.

5. The sweep phase. This is a concurrent phase. CMS looks at all the objects and adds newly dead objects to its freelists. CMS does freelist allocation and it is during the sweep phase that those freelists get repopulated.

6. The reset phase. This is a concurrent phase. CMS is cleaning up its state so that it is ready for the next collection.
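
The six phases above can be sketched as a toy model. This is not HotSpot code: the names CmsPhase, Obj, ToyCms, and ToyCmsDemo are invented for illustration, the "record of changes" is a plain queue rather than the card table CMS really uses, and everything runs on one thread so that the interleaving is deterministic. It replays the AA/BB/XX example from phase 3, where XX is handed from AA to BB while marking is in progress.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ToyCmsDemo {
    // Runs one toy collection and returns the names of the objects that were freed.
    static List<String> collect() {
        Obj aa = new Obj("aa"), bb = new Obj("bb"), xx = new Obj("xx"), yy = new Obj("yy");
        aa.refs.add(xx);                          // before the collection: aa keeps xx alive, yy is garbage
        List<Obj> heap = Arrays.asList(aa, bb, xx, yy);
        ToyCms gc = new ToyCms();

        List<Obj> roots = Arrays.asList(bb, aa);  // 1. initial mark (stop-the-world): snapshot the roots
        gc.markFrom(roots.get(0));                // 2. concurrent mark: bb is traced first...
        gc.store(bb, xx);                         //    ...then the application hands xx to bb
        aa.refs.remove(xx);                       //    and aa drops it before aa is traced
        gc.markFrom(roots.get(1));                //    tracing aa no longer finds xx
        gc.rescanChanged();                       // 3. preclean: the recorded change to bb marks xx
        gc.rescanChanged();                       // 4. remark (stop-the-world): nothing left in this run
        List<String> freed = new ArrayList<>();
        for (Obj o : gc.sweep(heap)) freed.add(o.name);  // 5. sweep: unmarked objects are freed
        gc.reset();                               // 6. reset: ready for the next collection
        return freed;
    }

    public static void main(String[] args) {
        System.out.println("freed: " + collect()); // prints "freed: [yy]"
    }
}

// The phases from the list; the flag says whether application threads are stopped.
enum CmsPhase {
    INITIAL_MARK(true), CONCURRENT_MARK(false), PRECLEAN(false),
    REMARK(true), SWEEP(false), RESET(false);

    final boolean stopTheWorld;
    CmsPhase(boolean stopTheWorld) { this.stopTheWorld = stopTheWorld; }

    static long stopTheWorldCount() {
        long n = 0;
        for (CmsPhase p : values()) if (p.stopTheWorld) n++;
        return n;
    }
}

// Hypothetical heap object: a name plus its outgoing references.
final class Obj {
    final String name;
    final List<Obj> refs = new ArrayList<>();
    Obj(String name) { this.name = name; }
}

final class ToyCms {
    final Set<Obj> marked = new HashSet<>();
    final Deque<Obj> changed = new ArrayDeque<>();  // stand-in for the record of changes

    // Reference store that also records which object was modified.
    void store(Obj holder, Obj target) {
        holder.refs.add(target);
        changed.add(holder);
    }

    // Marks everything reachable from start that is not already marked.
    void markFrom(Obj start) {
        Deque<Obj> pending = new ArrayDeque<>();
        pending.push(start);
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (marked.add(o)) {
                for (Obj r : o.refs) pending.push(r);
            }
        }
    }

    // Revisits modified objects that are already marked and marks whatever they now reference.
    void rescanChanged() {
        Obj holder;
        while ((holder = changed.poll()) != null) {
            if (marked.contains(holder)) {
                for (Obj r : holder.refs) markFrom(r);
            }
        }
    }

    // Returns the objects that were never marked, i.e. the ones to free.
    List<Obj> sweep(List<Obj> heap) {
        List<Obj> freed = new ArrayList<>();
        for (Obj o : heap) if (!marked.contains(o)) freed.add(o);
        return freed;
    }

    void reset() { marked.clear(); changed.clear(); }
}
```

Without the call to rescanChanged(), xx would be swept even though bb still references it, which is why the changes made during concurrent marking have to be caught before the sweep.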

That's the important part of this blog. I'm going to ramble now.

I recently talked to some users who, of course, knew how their application worked but who didn't know much about how garbage collection worked. Why should they? That's my job. During our chat I came to appreciate much more the fact that there were things that I habitually left unspoken. So, here's one of the unspoken truths that I forget to explain: the phases of CMS.


Comments ( 4 )
  • Blair Zajac Tuesday, February 12, 2013

    Thanks for writing this.

    I'm curious how the JVM stops all the threads. Is there a global flag that each thread checks every so often?



  • guest Tuesday, February 12, 2013

    The VM has a safepoint mechanism (it brings the application to a halt for things such as GC) that uses a protected page (not readable, I think) to interrupt all the application threads. It's not specifically a GC thing, so I don't know the details. There is safepointing code inserted in the interpreted and compiled code that tries to touch the page and causes the application thread to be interrupted.

  • guest Wednesday, February 13, 2013

    Very nicely written Jon, as always. I'm sure I'm not alone in saying we appreciate you taking the time to share the details. I am looking forward to a writing / description along with the gory details of "perm gen removal" in a future blog. ;-)

  • Eugeny Osadchuck Wednesday, February 20, 2013

    >>Very nicely written Jon, as always


    Excellent explanation. Thank you lot.
