Introduction
In our last blog post, we discussed the concept of Gradual Rewrite, a method for reducing technical debt in large codebases. However, theory only gets you so far; it’s important to also put it into practice. In this post, we’ll share our experience using the Gradual Rewrite method to tackle the technical debt of a business-critical Android library. At DISQO, we are constantly thinking about how we can improve our products for our users and how we can make the product development experience as smooth as possible. We consistently research new technical solutions and listen carefully to feedback from our users to make sure our products meet their needs.
What is the Surf to Earn SDK?
We implemented this strategy on the DISQO behavioral SDK, Surf to Earn, for Android. The SDK serves as a feature of the Survey Junkie platform, providing members with a way to earn more points (and thus more money) by sharing some of their browsing history with market research companies to receive more relevant surveys and additional points. The collection is enabled only if a Survey Junkie Android app user provides explicit consent. Once the user has given the green light and downloaded the SDK, it goes to work, sending events such as page visitations, e-commerce events, ad impressions, and media consumption to the DISQO CX platform.
The behavioral data is sourced from multiple places, but essentially the challenge can be described as a task of mapping a list of inputs to zero or more behaviors.
Example
class State(
    val behaviors: List<Behavior>
    /* Other interesting fields required to detect future behaviors */
)

fun <T : Input> find(event: T, previousState: State?): State

var currentState: State? = null

override fun <T : Input> onInput(input: T) {
    find(input, currentState)
        .also { currentState = it }
        .behaviors
        .forEach { behavior ->
            // Process a behavior
        }
}
We cannot simply batch a bunch of Inputs into a list and pass that to our find function, since there are certain Inputs that don’t preserve their history. For that reason, we need to persist relevant information in some kind of State. This State is then used to retain the data necessary for detecting future behaviors.
Gradual rewrite
Initially, the Survey Junkie app development team also worked on the SDK development. However, in late 2021, we separated the SDK development from the main app and entrusted its management to a newly formed team. As with any transition, it required careful planning and execution to ensure a seamless handover of responsibilities.
Codebase analysis
As DISQO has grown, so too has the need for a codebase that is easily understood by a broader team. The SDK, which collected behavioral data, had unique needs that had outgrown the original code. Our challenge was to analyze the existing codebase so that we could develop a new one that supported the library’s architecture and performance needs, with less reliance on delegation, inheritance, and mutable state.
Choose a subsystem
This was the easy part since the code was nicely divided into mostly decoupled subsystems that resulted from the nature of the behavioral data collection. The behaviors that we are interested in are emitted from certain target apps on the device. For example, page visitations come from browsers, ad impressions come from social media apps, etc. Since the code was divided into subsystems per target app, we could split the rewrite and tackle one target app at a time.
Identify metrics
Fortunately, we weren’t starting from square one when it came to measuring behavior detection accuracy. We already had some metrics in place, but we wanted to enhance them. We ramped up the metrics collection by including attribute fill rates and adding more granular breakdowns. This allowed us to easily pinpoint differences between SDK versions and specific target apps.
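To give a rough idea of what an attribute fill rate means here: for each attribute of a behavior, it is the share of detected behaviors in which that attribute is actually populated. The snippet below is only a sketch of that idea; the map-based behavior shape is hypothetical and not the SDK’s real model.

// Hypothetical behavior shape, used only to illustrate the fill-rate metric.
data class RecordedBehavior(val type: String, val attributes: Map<String, Any?>)

// Fill rate per attribute: the fraction of behaviors carrying a non-null value.
fun fillRates(behaviors: List<RecordedBehavior>): Map<String, Double> {
    if (behaviors.isEmpty()) return emptyMap()
    val attributeNames = behaviors.flatMap { it.attributes.keys }.toSet()
    return attributeNames.associateWith { name ->
        behaviors.count { it.attributes[name] != null }.toDouble() / behaviors.size
    }
}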
Automate test creation
As mentioned above, we needed to develop a new codebase. Although the previous code had very high unit test coverage, the tests didn’t exercise how it performed against real Inputs, since almost everything was mocked. That made changing the code a very brittle experience.
We brainstormed how we could create an environment that would improve our confidence in changing the code. We quickly realized that it was possible to persist all the Inputs to the disk. This would allow us to replay the same data over and over again, ensuring we didn’t create regressions when we worked with the code.
We started by writing an InputRecorder that would store all the Inputs and behaviors seen during the device usage session.
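The recorder itself can be fairly small. Below is a minimal sketch of the idea, assuming a caller-supplied serializer and a simple file-per-session layout; the class shape and names are illustrative rather than the actual SDK implementation.

import java.io.File

// Minimal sketch of an input recorder (illustrative, not the actual SDK code).
// It appends one line per event so the whole device session can be replayed later.
class InputRecorder(
    private val sessionFile: File,
    private val serialize: (Any) -> String
) {
    // Persist every observed Input in the order it arrived.
    fun onInput(input: Input) {
        sessionFile.appendText("INPUT ${serialize(input)}\n")
    }

    // Also record the behaviors the SDK detected, so a replay can be
    // compared against what was actually seen on the device.
    fun onBehavior(behavior: Behavior) {
        sessionFile.appendText("BEHAVIOR ${serialize(behavior)}\n")
    }
}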
Mock time
Correctly replaying the stored events turned out to be more complicated than we anticipated. The main reason was that the code used a lot of delays and time limiters such as:
var lastInvoked = 0L

fun runIfThresholdPassed(threshold: Long, action: () -> Unit) {
    val elapsed = System.nanoTime() - lastInvoked
    if (elapsed > threshold) {
        action()
        lastInvoked = System.nanoTime()
    }
}
This forced us to persist the time it took to execute certain code blocks and mock the time in the unit tests. We also needed to increment it based on the time in the persisted Inputs themselves, which made the tests more fragile. Nevertheless, we decided early on that the new code would not rely on any time-based optimization, which meant we could discard the need to persist execution times in the future.
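One common way to make this kind of code replayable is to inject the time source instead of calling System.nanoTime() directly, so a test can advance the clock by the deltas stored with the persisted Inputs. The sketch below illustrates that idea; the names are ours, not the SDK’s actual implementation.

// Injectable time source so tests can control the clock (illustrative names).
fun interface TimeSource {
    fun nanoTime(): Long
}

class ThresholdRunner(
    private val timeSource: TimeSource = TimeSource { System.nanoTime() }
) {
    private var lastInvoked = 0L

    fun runIfThresholdPassed(threshold: Long, action: () -> Unit) {
        val elapsed = timeSource.nanoTime() - lastInvoked
        if (elapsed > threshold) {
            action()
            lastInvoked = timeSource.nanoTime()
        }
    }
}

// A test clock that is advanced by the execution times persisted with the Inputs.
class FakeTimeSource(private var now: Long = 0L) : TimeSource {
    override fun nanoTime(): Long = now
    fun advanceBy(nanos: Long) { now += nanos }
}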
But there was a silver lining to our approach. By mocking time, we could execute the tests lightning-fast. If creating a test case (turning on the InputRecorder and using the device to trigger behaviors) took 30 seconds, the actual test run took only a few hundred milliseconds.
Another issue we faced was that we wanted to run the tests as plain JUnit tests so that they would not need the overhead from Robolectric or Android Instrumentation tests. For that to happen, we could not use any Android-specific code in our behavior detection logic. Thankfully, this wasn’t such a big problem, since behavior detection mostly involves tree traversal, which is platform-agnostic.
Create tests
Creating tests was the most important part of our process. After we had figured out all the bits and pieces of how to persist the Inputs, we created the test cases. We could have deployed the InputRecorder to production, but we soon noticed it wouldn’t work very well for our use case. To create effective tests, we needed to visually confirm the behaviors that were triggered during test creation, and we couldn’t rely on actual users for that confirmation.
We took matters into our own hands and created the test cases ourselves. The test engineers, with help from developers, would enable the InputRecorder, open the target app, and use it as they normally would. They would then write down all the behaviors they initiated during the test creation process. Once they were finished, they would compare their list of behaviors to the list that the Surf to Earn SDK detected. If there were missing behaviors in the list, they were manually added.
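Putting the pieces together, a recorded session can then be replayed in a plain JUnit test on the JVM. The sketch below assumes a hypothetical loadRecordedSession helper that parses a file written by the recorder; the helper, file name, and session shape are illustrative rather than our actual test code.

import org.junit.Assert.assertEquals
import org.junit.Test

class RecordedSessionTest {

    @Test
    fun `replayed session detects the expected behaviors`() {
        // Hypothetical helper that loads the Inputs and the manually verified
        // behaviors captured for one device session.
        val session = loadRecordedSession("browser_page_visits.session")

        var state: State? = null
        val detected = mutableListOf<Behavior>()

        // Replay every recorded Input through the platform-agnostic detection logic.
        for (input in session.inputs) {
            val next = find(input, state)
            detected += next.behaviors
            state = next
        }

        assertEquals(session.expectedBehaviors, detected)
    }
}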
Rewrite
After creating tests, the actual rewrite process was fairly straightforward:
1. Take the good parts of the old codebase.
2. Remove the bad parts.
3. Plug the good parts into a new, simplified architecture.
4. Run the tests and fix the code until all the tests pass.
5. Test manually.
6. If manual tests fail, record a new test case from that manual run and fix the code.
7. Repeat steps 5 and 6 until the manual tests pass.
Release
Fortunately, the Play Store has great support for canary releases, which we used in our release process. We decided early on to issue a release after rewriting each target app. This made it easier to track the metrics and react if we saw any regressions. We first released the update to five percent of our members and tracked the business-critical metrics defined earlier. From there, it was easy to decide whether to halt the release or increase the rollout percentage.
Results
It took us almost a year to rewrite all the behavior detection-related subsystems. However, we weren’t rewriting all the time; we were also able to implement new features and add detection for previously unsupported behaviors. As a result, our year-over-year per-day behavior counts grew by a jaw-dropping 40%. And the cherry on top? Our automated test creation process gave us the confidence we needed to release new code with ease.
But did we actually succeed in making the code base “cleaner” and easier to maintain? Well, that’s a tough question to answer, as there is no standardized way to measure the maintainability of a code base. We decided to use a static code analyzer for Kotlin called Detekt that outputs certain metrics about code.
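For reference, wiring Detekt into a Gradle build takes only a few lines. The snippet below is a minimal sketch using the Gradle Kotlin DSL; the version number and config path are examples, not our exact setup.

// build.gradle.kts -- minimal Detekt setup (version and paths are examples)
plugins {
    id("io.gitlab.arturbosch.detekt") version "1.23.0"
}

detekt {
    // Start from Detekt's default rules and layer project-specific tweaks on top.
    buildUponDefaultConfig = true
    config.setFrom(files("config/detekt/detekt.yml"))
}

Running the detekt Gradle task then prints a complexity report that includes, among other things, lines of code as well as cyclomatic and cognitive complexity.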
Even though we use this static code analyzer, we are open to other methods of measuring code maintainability.
Overall, the rewrite process was a huge success for our team. We improved the SDK’s behavior collection and our confidence in our ability to update the code. Furthermore, we reduced the time to resolve bugs and implement new features to a fraction of what it used to be, which translates to faster development cycles and a more efficient workflow. The codebase itself also underwent significant improvements, with a 41% reduction in LOC, a 35% reduction in cyclomatic complexity, and a 56% reduction in cognitive complexity. While these numbers are not a definitive measure of maintainability, they give us a good indication that we’ve made our codebase cleaner and easier to maintain. The rewrite was challenging, but it was a rewarding journey that has set our team up for future success.
Despite the success of our rewrite project, it is always good to keep in mind that paying down tech debt is a difficult and time-consuming task. If it is not done properly, the risk of regressions is high. However, when approached with a clear plan that identifies the risks and rewards and establishes a safe testing environment, a rewrite can pay dividends.