Future-proofing a next-gen platform for a leading data management player

Tech stack
Polymer JS, Apache Storm, Kafka, Elasticsearch, Ngram, Kibana
Enabled by our Custom Engineering Pod


Zemoso provided end-to-end digital transformation services for an industry leader in data management and governance platforms for product information management (PIM). Reputed industry analysts such as Gartner and Forrester have recognized the company’s platform several times for its leadership in this space.

Our client wanted to evolve its existing platform into a more intuitive, smarter version of itself and retain its stronghold in the industry.

Client’s ask

We co-built and future-proofed their industry-leading platform in a way that allowed them to be more agile and solve the data management problems of the future. The platform was ranked number 1 by both Gartner and Forrester. Specifically, the client asked us to:

   ●   Build the next-generation, on-demand, cloud-native data management and governance platform

   ●   Evaluate, modernize, and validate the tech stack that’s underpinning the platform to solve key business problems

   ●   Design an architecture for the platform that is agile, scalable, and uses DevOps best practices

   ●   Take an API-driven approach

Client wins

In six months, Zemoso helped launch a cloud-ready, minimum awesome product (MAP) version of the platform. The re-imagined platform played a crucial role in raising tens of millions in Series A funding from a VC firm. It also helped them continue to acquire and retain some of the largest customers across retail, CPG, healthcare, energy, and other sectors.

Our approach

The team aligned fast, pivoted as needed, tested rapidly, and delivered consistent, incremental wins throughout the partnership.

Next-generation, cloud-native master data management (MDM) platform with a modernized tech stack

We helped our client speed up development and evolution cycles, consistently delivering well-designed applications. After evaluating multiple providers, we chose Amazon Web Services (AWS) for its greater flexibility and scalability, and because its SDKs offered the lowest developer friction. In partnership with the client’s internal engineering team, we designed a multi-layered platform on a microservices architecture to keep the system modular and agile. The team used DevOps best practices for continuous integration and deployment to expand capabilities more quickly.

Upgrading the tech stack

For each capability, we evaluated several competing technologies and selected the best-suited one together with the client’s engineering leadership.

User interface: Polymer JS was chosen to create the next-gen user interface with added controls for tiered access protocols. It simplified setting up the different application elements and the relationships between them.

Event stream processing: Event stream processing (key to developing the deduplication functionality) was built using Apache Storm, a highly vetted solution for real-time stream processing. The team conducted a thorough evaluation of Apache Spark and Apache Storm. Apache Storm was a better fit than Spark, which is a far more complex, general-purpose computation engine.

Messaging layer: To ensure smart deduplication whenever a record is added or updated, a Kafka messaging layer transfers the data between applications. Kafka is a fast and scalable event distributor. The ‘deduplication’ and ‘match and merge’ queues were robust and processed high volumes of data with minimal downtime or data loss. Kafka integrated seamlessly with Apache Storm for real-time streaming data analysis, which helped the client’s customers gain faster insights.

Improved searchability: For search, the team chose Elasticsearch. It is a distributed, free and open search and analytics engine that powers search solutions for global giants like Microsoft, Netflix, Slack, and Uber. This text-based, NoSQL search tool proved highly useful in indexing the data points needed to fulfill search parameters. The team used advanced indexing techniques like n-grams to generate superior match results. For instance, with only the first three letters as input, the search engine could match and return the product name.
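To make the three-letter example concrete, here is a minimal, pure-Python sketch of prefix (edge) n-gram indexing. It only illustrates the idea and is not the Elasticsearch configuration used on the project: the function names, the sample product names, and the `min_gram`/`max_gram` values are all assumptions made for this example. In Elasticsearch itself, comparable behavior is typically configured with its built-in `ngram` or `edge_ngram` tokenizers rather than hand-written code.

```python
from collections import defaultdict


def edge_ngrams(token, min_gram=3, max_gram=10):
    """Return the prefixes of `token` with length min_gram..max_gram."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def build_index(product_names, min_gram=3, max_gram=10):
    """Map every prefix n-gram of every word to the product names containing it."""
    index = defaultdict(set)
    for name in product_names:
        for token in name.lower().split():
            for gram in edge_ngrams(token, min_gram, max_gram):
                index[gram].add(name)
    return index


def search(index, query):
    """Look up a single query term; returns matching product names, sorted."""
    return sorted(index.get(query.lower(), set()))


# Hypothetical sample data, for illustration only.
products = ["Stainless Steel Kettle", "Steel Water Bottle", "Ceramic Mug"]
index = build_index(products)

print(search(index, "ste"))  # ['Stainless Steel Kettle', 'Steel Water Bottle']
print(search(index, "cer"))  # ['Ceramic Mug']
```

Because the prefixes are computed at index time, a three-letter query becomes a single dictionary lookup. The trade-off is a larger index, which is why the minimum and maximum gram lengths are usually kept small.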

Custom deduplication algorithm

As different operators, suppliers, and vendors enter product data into the platform, it is important to maintain the integrity of the product information. One key capability that enables this is deduplication. A retailer’s core functions depend on these systems accessing and displaying accurate information. We helped our MDM client ensure that the same product is not listed twice under different unique IDs.

Our team developed and deployed the deduplication algorithm on top of the newly upgraded tech stack. The algorithm demonstrated that the tech stack we evaluated and set up worked as intended.

The algorithm used similarity triggers around name, brand, color, and other attributes to flag potential duplicates with a probability index in a database of over a million records. Every change in the system is an ‘event’. Each event is added to a queue and goes through a ‘match and merge’ protocol. In the case of a suspect match, the event is assigned to a separate queue to be resolved manually, thus maintaining the right records for critical business functions.
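The flow described above can be sketched roughly as follows. This is not the client’s algorithm, which is proprietary and not described in detail here. It is a minimal illustration that assumes a weighted string similarity over name, brand, and color, with two hypothetical thresholds deciding whether an event is merged automatically, sent for manual review, or treated as a new record. The weights, thresholds, function names, and sample records are all invented for the example, and in-memory `deque` objects stand in for the Kafka queues. The score is a plain similarity value between 0 and 1, not a calibrated probability.

```python
from collections import deque
from difflib import SequenceMatcher

# Hypothetical weights and thresholds, chosen only for illustration.
WEIGHTS = {"name": 0.6, "brand": 0.3, "color": 0.1}
MERGE_THRESHOLD = 0.9   # at or above this score: treat as the same product
REVIEW_THRESHOLD = 0.7  # between the two thresholds: suspect match


def field_similarity(a, b):
    """Case- and whitespace-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def match_score(record, candidate):
    """Weighted similarity across the configured fields."""
    return sum(
        weight * field_similarity(record.get(field, ""), candidate.get(field, ""))
        for field, weight in WEIGHTS.items()
    )


def route_event(record, catalog, merge_queue, review_queue,
                merge_threshold=MERGE_THRESHOLD,
                review_threshold=REVIEW_THRESHOLD):
    """Compare an incoming record with the catalog and route it to a queue."""
    best_score, best_match = 0.0, None
    for candidate in catalog:
        score = match_score(record, candidate)
        if score > best_score:
            best_score, best_match = score, candidate

    if best_match is not None and best_score >= merge_threshold:
        merge_queue.append((record, best_match, best_score))
        return "merge"
    if best_match is not None and best_score >= review_threshold:
        review_queue.append((record, best_match, best_score))
        return "review"
    catalog.append(record)  # no plausible duplicate: keep as a new record
    return "new"


# In-memory stand-ins for the 'match and merge' and manual-review queues.
merge_queue, review_queue = deque(), deque()
catalog = [{"name": "Steel Water Bottle", "brand": "Acme", "color": "Silver"}]

same = {"name": " steel water bottle", "brand": "ACME", "color": "silver"}
other = {"name": "Ceramic Mug", "brand": "Zenith", "color": "White"}

print(route_event(same, catalog, merge_queue, review_queue))   # merge
print(route_event(other, catalog, merge_queue, review_queue))  # new
```

A pairwise scan like this is only practical for small catalogs. At the scale mentioned above (over a million records), candidates would normally be narrowed down first, for example with a search index, before any pairwise scoring.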

Reporting dashboards

We co-created dashboards to improve access to insights from the master data that the business can leverage. These were designed to help the client’s customers visualize data and create intelligent reports for better analysis. Some of these dashboards helped analyze change trends, update summaries, workflow SLAs, governance summaries, and so on.

The team used Kibana, which is a browser-based analytics and search dashboard for Elasticsearch. This enabled quicker analysis and faster compilation of the data, and users could easily share these reports with stakeholders. Its machine learning features also helped uncover insights in the Elasticsearch data.


Benefits to the client were:

   ●   The client raised a large Series A round

   ●   Validated tech stack choices with faster feature launches

P.S. Since we work on early-stage products, many of them in stealth mode, we have strict non-disclosure agreements (NDAs). The data, insights, and capabilities discussed in this blog have been anonymized to protect our client’s identity and don’t include any proprietary information.

