Welcome to the Oracle CX blog:
The latest in customer experience strategy, technology, and innovation.

LinkedIn’s Elegant Solution to Monitoring Network Performance

I've been working with Oracle Service Cloud since 2007 (back when it was still RightNow). For the past five years I've managed LinkedIn's Oracle Service Cloud implementation, handling everything from standard administration tasks and managing workspaces, workflows and business rules to doing C# development for Add-Ins.

Recently, I had an "aha" moment. I discovered a way to monitor network performance for Oracle Service Cloud (and beyond!). This solution changed the game for LinkedIn, and I wanted to share in hopes it might help others.

Challenge: Doing Root Cause Analysis of Performance Complaints

LinkedIn has been using Oracle Service Cloud for several years across our contact centers, located primarily in the U.S., Dublin, Singapore, and Bangalore. During this time our global team has reported that Oracle Service Cloud was s-l-o-w or crashing from time to time.

Our Oracle Service Cloud administration team would investigate these issues, but it was extremely difficult to know where the issue originated. Was it a problem with Oracle Service Cloud, the internet connection for a specific LinkedIn location, or the agent's computer? Were our scheduled reports and system utilities running at 12am CST causing performance issues for our Bangalore team? To make things harder, our administration team is primarily based in the U.S., so we couldn't always troubleshoot our global team's issues in real time.

Despite our best efforts at resolution we didn't have a consistent, efficient solution for identifying the root cause of Oracle Service Cloud performance complaints. It was an ongoing headache and black hole of effort for our team.

Solution: Create Network Performance Monitor Add-In

During a discussion around our need for visibility into network performance issues, a thought occurred: the Add-In framework was always running in the background within Oracle Service Cloud and I could set up an Add-In to create an ongoing log of network performance metrics across all of our office locations and remote agents.

[Diagram of LinkedIn's Network Performance Monitor Add-In Solution]

I spent a day designing, building and testing a solution using:

  • A custom object to store a wide array of network performance data
  • A lightweight Add-In that would ping five globally accessible sites: Oracle Service Cloud chat server, Oracle Service Cloud production server, LinkedIn.com, Google.com and Facebook.com. The Add-In starts logging when the user logs in and runs in the background every 15 minutes until they log out. I tried to find a balance between the storage space consumed by these logs and having enough data to be useful. For example, an hour between each monitoring cycle is too long to determine how the network was performing at the time of a complaint.

This solution generates several data points on network performance across five sites
for the entire day (because users start at different times) for every Oracle Service
Cloud agent worldwide at LinkedIn.
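
To make the idea concrete, here is a minimal sketch of a single monitoring cycle. It is written in Python for brevity rather than the C# an Add-In would use, and it is not the Add-In's actual code. The names `ProbeResult`, `tcp_connect_ms` and `run_cycle` are hypothetical, the fields are a guess at what one log row might hold, and timing a TCP connection stands in for a true ICMP ping. The two Oracle Service Cloud hosts are site-specific, so they are left out of the example list.

```python
import socket
import time
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, List, Optional

# Hypothetical target list: public hosts only. The Oracle Service Cloud
# chat and production hosts differ per site and are omitted here.
TARGETS = ["www.linkedin.com", "www.google.com", "www.facebook.com"]


@dataclass
class ProbeResult:
    """One assumed log row: who measured which target, when, and how long it took."""
    agent: str
    location: str
    target: str
    measured_at: datetime
    latency_ms: Optional[float]  # None means the target could not be reached


def tcp_connect_ms(host: str, port: int = 443, timeout: float = 5.0) -> Optional[float]:
    """Time a TCP connection to host:port (a stand-in for an ICMP ping)."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return None


def run_cycle(
    agent: str,
    location: str,
    targets: List[str],
    probe: Callable[[str], Optional[float]] = tcp_connect_ms,
) -> List[ProbeResult]:
    """Probe every target once and return one row per target."""
    now = datetime.now(timezone.utc)
    return [ProbeResult(agent, location, t, now, probe(t)) for t in targets]
```

Calling `run_cycle("some.agent", "Dublin", TARGETS)` would return three rows that share one timestamp. Passing the probe in as a parameter keeps the cycle easy to test without touching the network.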

Benefits: Faster Root Cause Analysis, Improved Performance and More!

Now when we receive a complaint about Oracle Service Cloud performance, we can easily check the logs from our Network Performance Monitor Add-In and quickly tell if a network issue is/was the cause. We can also then identify if it was isolated to the Oracle Service Cloud servers, network issues at a specific LinkedIn office, or only impacting a specific agent's local machine.
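
The triage described above can be sketched as a small decision rule. This is only an illustration in Python, not how the team actually reads its logs: the `Sample` shape, the 500 ms `SLOW_MS` threshold, the target labels and the `classify` function are all assumptions made for the example.

```python
from typing import Iterable, NamedTuple, Optional, Set


class Sample(NamedTuple):
    agent: str
    location: str
    target: str                  # a label such as "osc-prod" or "google"
    latency_ms: Optional[float]  # None means unreachable


SLOW_MS = 500.0  # assumed threshold; tune it to your own baseline


def is_slow(sample: Sample) -> bool:
    return sample.latency_ms is None or sample.latency_ms > SLOW_MS


def classify(samples: Iterable[Sample], agent: str, oracle_targets: Set[str]) -> str:
    """Guess where a complaint originated from samples taken around that time."""
    samples = list(samples)
    mine = [s for s in samples if s.agent == agent]
    slow_mine = [s for s in mine if is_slow(s)]
    if not slow_mine:
        return "network not implicated"
    # Only the Oracle hosts were slow while other sites were fine.
    if all(s.target in oracle_targets for s in slow_mine):
        return "Oracle Service Cloud servers"
    # Other sites were slow too: check whether colleagues in the same office saw it.
    location = mine[0].location
    peers_slow = any(
        is_slow(s) for s in samples if s.location == location and s.agent != agent
    )
    return "office network" if peers_slow else "agent's local machine"
```

A real investigation would weigh more than one cycle of data, but the ordering of the checks mirrors the three causes named above.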

Having consistent, high-level visibility into network performance saves my team countless hours of troubleshooting! Instead of working with IT and 15 other teams trying to gather and analyze data, we now have a good starting point and can identify the root cause significantly faster.

While this customization was designed to help with troubleshooting, we've experienced many other benefits, including:

  • We understand how ping times vary across locations and have clearer expectations on performance across different channels (e.g. "How fast is chat in Bangalore?"). We can run performance benchmarking across locations.
  • We don't have to ask users to send screenshots, run network traces, or try other browsers or applications. Ironically, we have many people say, "No, everything else is running fine," but the network performance monitor logs tell a different story!
  • We submit fewer service requests to Technical Support unless it's an Oracle Service Cloud-specific issue.
  • We share network performance data from our logs with our local IT teams when there are issues, so they can pinpoint those specific timeframes in their own logs to see what is going on. We can use this data to drive network performance improvements in our support locations.
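
As an illustration of the benchmarking mentioned above, the following Python sketch computes a median latency per location and target. The row shape `(location, target, latency_ms)` and the function name are assumptions for the example; in practice this would more likely be a report over the custom object.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, List, Optional, Tuple

Row = Tuple[str, str, Optional[float]]  # (location, target, latency_ms)


def median_latency_by_location(rows: Iterable[Row]) -> Dict[Tuple[str, str], float]:
    """Median latency per (location, target); unreachable samples (None) are skipped."""
    buckets: Dict[Tuple[str, str], List[float]] = defaultdict(list)
    for location, target, latency_ms in rows:
        if latency_ms is not None:
            buckets[(location, target)].append(latency_ms)
    return {key: median(values) for key, values in buckets.items()}
```

The median is used rather than the mean so that one very slow sample does not distort a location's baseline.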

Since we can nail down the source of performance complaints, we've had fewer inaccurate reports of Oracle Service Cloud performance issues, and we have saved a significant amount of time on troubleshooting issues ultimately not related to Oracle Service Cloud! In short, this relatively simple Oracle Service Cloud customization has created huge value.

Advice: Sharing A Few Tricks of the Trade

If you're interested in creating a Network Performance Monitoring Add-In for your organization here are some things to keep in mind:

  • First, have a clear picture of what you're doing and why. The specifics of our solution might not make sense for your organization.
  • Timer events are your friend. You can configure them to kick off every minute, five minutes, 15 minutes, etc.
  • Pick an Add-In that is always running, instead of a conditional Add-In (e.g. report Add-Ins can only run when the related report is open). I used a Navigation Section Add-In that loads as soon as someone logs in and continues to run the entire time in the background - the user doesn't even know it's there, and it has no performance impact if you (correctly) use threading / tasks.
  • The .NET framework allows you to set up a server config variable in Oracle Service Cloud, which gives admins the ability to change an Add-In's property values without requiring a developer. I used the server config variable to set our timer event interval, which allows the admin team to change the default value from 15 minutes to one, five, or 10 minutes, etc. This also enables the admin team to change the list of URLs the Add-In pings on the fly. They can make adjustments based on specific business scenarios without having to engage a developer to make these changes in the code.
  • Test, test, test! Verify your Add-In is working as you expected by plugging in sample data and watching it work. Manually verify that the data being collected matches your expectations in testing.
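
The timer, threading and config-variable tips above can be sketched together. This is a hedged Python illustration, not Add-In code: a real Add-In would use the .NET timer and the Oracle Service Cloud server config variable. The JSON format of the config string, the key names `interval_minutes` and `targets`, and the functions `parse_settings` and `start_monitor` are all assumptions made for the example.

```python
import json
import threading
from typing import Callable, List, Optional, Tuple

DEFAULT_INTERVAL_MINUTES = 15
DEFAULT_TARGETS = ["www.linkedin.com"]


def parse_settings(raw: str) -> Tuple[int, List[str]]:
    """Read the interval and target list from a config string, falling back to defaults."""
    try:
        data = json.loads(raw) if raw else {}
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    interval = data.get("interval_minutes")
    if isinstance(interval, bool) or not isinstance(interval, int) or interval < 1:
        interval = DEFAULT_INTERVAL_MINUTES
    targets = data.get("targets")
    if not (isinstance(targets, list) and targets and all(isinstance(t, str) for t in targets)):
        targets = DEFAULT_TARGETS
    return interval, list(targets)


def start_monitor(
    cycle: Callable[[], None],
    interval_seconds: float,
    stop: threading.Event,
    max_cycles: Optional[int] = None,
) -> threading.Thread:
    """Run cycle() on a daemon thread until stop is set, so the caller never blocks."""
    def loop() -> None:
        done = 0
        while not stop.is_set():
            cycle()
            done += 1
            if max_cycles is not None and done >= max_cycles:
                break
            stop.wait(interval_seconds)  # sleeps, but wakes immediately when stop is set

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

Falling back to defaults on a malformed value means a typo in the config variable degrades to the 15-minute default instead of stopping the monitor. The `max_cycles` argument exists only to make the loop easy to exercise in a test.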

I hope this helps other organizations who may be struggling with similar issues and encourages you to take a step back when faced with common challenges and look for an entirely different solution. I'd love to hear your feedback on our solution or any other ways you've effectively dealt with this sort of challenge.

Comments ( 15 )
  • Adam Yavner Wednesday, May 9, 2018

    Love this, I have a few people in mind I plan to share this with, thanks!

  • Edson Junior Thursday, May 10, 2018

    Awesome Dan! Thanks for sharing this...

  • Anuj Behl Thursday, May 10, 2018

    Thanks for sharing Dan. really helpful.

  • Michael Locurcio Thursday, May 10, 2018

    Love the idea and how you use the data in your troubleshooting! Can you tell us more
    about what data you store in your COs?

  • Robert Pozderec Thursday, May 10, 2018

    Thanks for sharing!

  • Makarand Malandkar Monday, May 14, 2018

    Really useful post. Thanks for sharing Dan.

  • Gigi Wednesday, May 16, 2018

    Awesome, DanO! I appreciate you taking the time to share.

  • Ammar Aldaffaie Wednesday, May 23, 2018

    This is awesome! Thank you for Sharing!

  • Ivan Abaitey Friday, May 25, 2018

    I enjoyed reading this with my team. Thanks for sharing.

  • Andrew Wooster Monday, June 4, 2018

    Makes perfect sense, thank you for sharing!

  • Jens Monday, June 18, 2018

    This is great Dan - thanks for sharing!

  • Jess Campbell Thursday, July 26, 2018

    Sweet idea! We mainly just make our users do quick network checks when an issue occurs.

  • Dawn M. Smith Monday, August 13, 2018

    Would you share approx. how many agents you have.. looks like a great solution but
    we have a lot of agents here and wondering about being able to scale....

  • DanO Monday, August 13, 2018

    Hey Dawn - we have around 600 active users. So far there have been no issues with
    performance, and the data has been incredibly helpful.


  • Trine Larsen (@trine.larsen) Thursday, December 19, 2019
    This is actually really interesting! I will absolutely keep this post in mind.