Performing Under Pressure | Part 2

Collecting and visualizing load-test performance data

In part 1 of this article, we covered writing web app load tests using multi-mechanize. This post picks up where that one left off and discusses how to gather interesting and actionable performance data from a load test, using (of course) Traceview as an example.

The big problem we had after writing load tests was that timing data gathered by multi-mechanize is inherently external to the application. This means it can tell us the response times of requests when the app is under load, but it can't identify bottlenecks or configuration problems. So we need to gather more data about how the internals of our web application respond to the workload.
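To make the external-versus-internal distinction concrete, here is a minimal sketch. The handler, the layer names, and the LayerTimer class are all hypothetical stand-ins (real instrumentation such as Traceview's works very differently); the point is only that the client-side timer sees one number while the app-side timer sees where the time went.

```python
import time
from contextlib import contextmanager


class LayerTimer:
    """Hypothetical in-app timer that accumulates seconds spent per layer."""

    def __init__(self):
        self.layers = {}

    @contextmanager
    def layer(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            self.layers[name] = self.layers.get(name, 0.0) + elapsed


def handle_request(timer):
    # Hypothetical request handler; the sleeps stand in for real work.
    with timer.layer("database"):
        time.sleep(0.02)
    with timer.layer("render"):
        time.sleep(0.01)


def external_latency(fn, *args):
    """What a load-test client sees: a single wall-clock number per request."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


timer = LayerTimer()
total = external_latency(handle_request, timer)
print("external view: %.3fs total" % total)
for name, seconds in timer.layers.items():
    print("internal view: %s took %.3fs" % (name, seconds))
```

The external number is enough to say "this request was slow", but only the per-layer numbers say why.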

For this article, I'll be using Traceview's instrumentation, which is installable from OS-native packages and, in the case of Reddit, takes care of instrumenting nginx, pylons, SQL queries, memcache calls, and Cassandra calls automatically.

Test 1: ramp-up read threads

So, this test is going to run for 30 minutes and generate steadily increasing read-oriented loads on various pages.  Like in a cooking show, I've taken all the waiting out, so let's skip straight to the results!
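For readers who want a picture of what "steadily increasing" means, the sketch below models a linear ramp as a list of user start times. Only the 30-minute duration comes from the test above; the 60-user ceiling and both function names are assumptions made for illustration, and multi-mechanize handles ramp-up itself through its own configuration.

```python
def start_offsets(max_users, rampup_s):
    """Seconds after test start at which each simulated user begins (linear ramp)."""
    spacing = rampup_s / max_users
    return [i * spacing for i in range(max_users)]


def active_users(elapsed_s, max_users, rampup_s):
    """Number of simulated users that have started by elapsed_s."""
    offsets = start_offsets(max_users, rampup_s)
    return sum(1 for offset in offsets if offset <= elapsed_s)


RUN_TIME_S = 30 * 60   # 30-minute test, as in the article
MAX_USERS = 60         # hypothetical ceiling, not from the article

for minute in (0, 10, 20, 30):
    users = active_users(minute * 60, MAX_USERS, RUN_TIME_S)
    print("minute %2d: %d concurrent users" % (minute, users))
```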

What we're looking at here is the performance of the default open-source Reddit install under a steadily increasing read load, broken down by layer of the stack:

[Image: load-test]

At first, it performs like a champ. But as the number of concurrent users rises over time, we see that requests slow down. In fact, it looks like we are spending a lot of time per-request in nginx.

We also have access to machine metrics here (blue bar at bottom), so I've pulled up the load on the box. Our machine is bored (the max load it reaches is only 1.06), but it's serving slowly! This is a sign that we might not have enough worker threads in our application layer.

In fact, the default Reddit install only sets up a single uwsgi worker. So, let's fix that, and move on. Here's what it looks like with 10 uwsgi processes, same workload:

[Image: load-tests]

It seems that we've traded our uwsgi queuing problems for an overloaded machine, but at least it's fully utilizing the hardware now, and our throughput is much greater!
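The queuing effect behind the single-worker result can be sketched with a tiny simulation. This is a toy model, not a description of how uwsgi schedules requests: the arrival rate and the fixed 0.1-second service time are made-up numbers, and the model ignores CPU contention entirely, so it shows the queuing side of the trade-off but not the overloaded machine.

```python
import heapq


def mean_latency(arrival_times, service_s, workers):
    """Average time from arrival to completion with a fixed-size worker pool."""
    # Min-heap holding the time at which each worker next becomes free.
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    total = 0.0
    for arrival in arrival_times:
        # A request starts when it has arrived AND the earliest worker is free.
        start = max(arrival, heapq.heappop(free_at))
        finish = start + service_s
        heapq.heappush(free_at, finish)
        total += finish - arrival
    return total / len(arrival_times)


# Hypothetical workload: 20 requests per second for 10 seconds.
arrivals = [i * 0.05 for i in range(200)]

one_worker = mean_latency(arrivals, service_s=0.1, workers=1)
ten_workers = mean_latency(arrivals, service_s=0.1, workers=10)
print("1 worker:   %.2fs mean latency" % one_worker)
print("10 workers: %.2fs mean latency" % ten_workers)
```

With one worker, requests arrive faster than they can be served, so each one waits behind a growing queue even though nothing in the model is "busy" beyond that single worker. With ten workers, no request waits at all in this toy model.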

Test 2: ramp-up write threads

This test will vote and submit comments on a particular thread with increasing numbers of logged-in users. OK, go!

[Image: load testing]

One really interesting thing is that we can see two distinct trends in the data: one band slows down faster than the other. Selecting them for comparison, we can see that the slower band is for rendering the comments, while the faster one is the POST requests for commenting/voting:

[Image: load-test]
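If you only have raw timing samples rather than an interactive chart, the same two-band comparison can be approximated by grouping samples by request type and fitting a line to each group. The sample values below are invented purely for illustration (they are not measurements from this test), and the request labels are hypothetical.

```python
from collections import defaultdict


def slope(points):
    """Least-squares slope of y against x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in points)
    denominator = sum((x - mean_x) ** 2 for x, _ in points)
    return numerator / denominator


# Invented samples: (request kind, concurrent users, latency in seconds).
samples = [
    ("GET comments", 5, 0.40), ("GET comments", 10, 0.80), ("GET comments", 20, 1.60),
    ("POST vote", 5, 0.10), ("POST vote", 10, 0.15), ("POST vote", 20, 0.25),
]

by_kind = defaultdict(list)
for kind, users, latency in samples:
    by_kind[kind].append((users, latency))

# Seconds of added latency per additional concurrent user, per request kind.
growth = {kind: slope(points) for kind, points in by_kind.items()}
for kind, per_user in sorted(growth.items(), key=lambda item: -item[1]):
    print("%-13s +%.3fs per extra user" % (kind, per_user))
```

The request kind with the steeper slope is the band that degrades faster under load.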

We might have expected to see contention for the database (in this case, Postgres). However, by pushing the limits with our load tests, we figured out that the actual limiting factor will be cores on our app servers (or, in this case, server) before we have to worry about the database. Here's what the breakdown by layer of the stack looked like. Note that we're spending almost no time in our database calls (measured through SQLAlchemy and the Cassandra client):

[Image: load-testing]

Where to go from here:

  • Performance testing is not only valuable to ensure that a new Web app meets projected demand; it can also be part of your CI system to detect performance regressions during everyday development. Here's a screencast about getting performance tests running in Jenkins.
  • If your website is particularly AJAX-heavy, you may also want to do load testing that simulates a browser more faithfully and executes JavaScript, in order to create the exact load patterns that users will. This makes testing significantly more resource-intensive, as it requires spinning up headless browsers, but it can be accomplished using Selenium or a hosted Selenium service.
  • Tracelytics performance monitoring and analysis isn't only for load tests; most of our customers run our lightweight instrumentation in production as well as development environments.
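As one way to picture the CI idea from the first bullet, the sketch below compares the 95th-percentile latency of a load-test run against a stored baseline. It is not how the Jenkins setup in the screencast works; the function names, the 10% tolerance, and the sample numbers are all assumptions.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile; pct is an integer between 1 and 100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def check_regression(latencies, baseline_p95, tolerance=0.10):
    """Return (passed, p95); fails if p95 exceeds the baseline by more than tolerance."""
    p95 = percentile(latencies, 95)
    return p95 <= baseline_p95 * (1 + tolerance), p95


# Invented results from two hypothetical builds, in seconds.
good_build = [0.2] * 19 + [1.0]
slow_build = [0.3] * 20

for name, run in (("good build", good_build), ("slow build", slow_build)):
    passed, p95 = check_regression(run, baseline_p95=0.2)
    print("%s: p95=%.2fs %s" % (name, p95, "OK" if passed else "REGRESSION"))
```

In a CI job, a failed check would typically translate into a non-zero exit status so that the build is marked as failed.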

More Stories By Dan Kuebrich

Dan Kuebrich is a web performance geek, currently working on Application Performance Management at AppNeta. He was previously a founder of Tracelytics (acquired by AppNeta), and before that worked on AmieStreet/Songza.com.