Insight Platform—Technology Foundation for Code Insight

The Insight platform is a unifying framework for analysis of all types of system data. It provides the technology foundation for Code Insight, as well as for Pattern Insight’s future applications for analyzing other types of system data. The platform combines unique data mining technology with domain-specific analysis to enable advanced search and analysis on a wide range of system data inputs, with real-time response on huge data sets. The platform is highly extensible and pluggable, and delivers high performance and scalability.


Insight Platform—Foundation Technology

Parsers Tackle Any Kind of System Data

The Insight platform can handle any type of system data (source code, scripts, logs, and more) through either standard or custom parsers. For source code and scripts, it does not depend on a particular operating system, device, or compiler. Pointed at the root of a code base, it can start scanning all the code under that root right away, regardless of the compiler used, the OS the code was written for, or whether all of the code belongs to the same project.
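As a rough illustration of scanning everything under a root without relying on a compiler or build system, the sketch below walks a directory tree and hands each file to a parser chosen by file extension. It is a minimal sketch, not the platform's implementation: the names `scan_tree`, `parse_lines`, `parse_log`, and the `PARSERS` table are all hypothetical.

```python
import os
from typing import Callable, Dict, Iterator, List, Tuple

# Hypothetical parser type: takes file text, returns a list of records.
Parser = Callable[[str], List[str]]


def parse_lines(text: str) -> List[str]:
    """Fallback ("standard") parser: one trimmed record per non-blank line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def parse_log(text: str) -> List[str]:
    """Illustrative custom parser: keep the message after the first ': '."""
    return [
        line.split(": ", 1)[-1].strip()
        for line in text.splitlines()
        if line.strip()
    ]


# Assumed extension-to-parser table; unlisted extensions use parse_lines.
PARSERS: Dict[str, Parser] = {".log": parse_log}


def scan_tree(root: str) -> Iterator[Tuple[str, List[str]]]:
    """Walk every file under root and yield (relative path, parsed records)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            ext = os.path.splitext(name)[1].lower()
            parser = PARSERS.get(ext, parse_lines)
            # Read as text and tolerate odd bytes; no compiler is involved.
            with open(path, encoding="utf-8", errors="replace") as handle:
                yield os.path.relpath(path, root), parser(handle.read())
```

Because the walk treats files purely as text, unrelated projects that happen to sit under the same root are scanned in a single pass.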

Data Mining Engines Enable Advanced Search and Analysis on Parsed Data

  • Search Engine: Enables advanced search operations on huge data sets, with results returned in seconds.
  • Mining Engine: Enables pattern discovery, i.e., the identification of similarities that are repeated frequently. For example, given a code base, the mining engine can quickly and accurately detect similar code segments that appear multiple times.
  • Analysis Engine: Enables in-depth semantic understanding of code. It incorporates heuristic and statistical approaches for advanced analysis, and uses pruning techniques to improve accuracy and minimize false positives.
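The pattern discovery described for the mining engine can be illustrated with a deliberately simple stand-in: fingerprint every window of consecutive lines and report the fingerprints that occur more than once. This sketch only finds exact matches after whitespace normalization, which is far simpler than what a production mining engine would need. The function names and the `window` parameter are assumptions made for illustration.

```python
import hashlib
import re
from collections import defaultdict
from typing import Dict, List, Tuple


def normalize(line: str) -> str:
    """Collapse whitespace so formatting differences do not hide a match."""
    return re.sub(r"\s+", " ", line.strip())


def find_repeated_segments(
    files: Dict[str, str], window: int = 3
) -> Dict[str, List[Tuple[str, int]]]:
    """Map each repeated window fingerprint to (file, 1-based start line)."""
    seen: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for name, text in files.items():
        # Skip blank lines but remember each kept line's original number.
        lines = [
            (number, normalize(raw))
            for number, raw in enumerate(text.splitlines(), 1)
            if raw.strip()
        ]
        for start in range(len(lines) - window + 1):
            chunk = "\n".join(t for _, t in lines[start:start + window])
            digest = hashlib.sha1(chunk.encode("utf-8")).hexdigest()
            seen[digest].append((name, lines[start][0]))
    # Keep only fingerprints that occur in more than one place.
    return {d: locs for d, locs in seen.items() if len(locs) > 1}
```

Each entry in the result lists the places where the same `window`-line segment appears, which is the raw material for reporting code that has been copied.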

Self-Learning Knowledge Base Supports Intelligent, Real-Time Analysis

  • The knowledge base stores domain-specific patterns generated by the data mining engines, as well as relevant metrics.
  • It also stores indexes which enable real-time response. The indexes are incremental, ensuring that the platform can scale to handle continuously updated system data of any size, without degrading performance.
  • It is self-learning—it can make inferences based on history, and can “learn” using training data, enabling activities like trend analysis.
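To make the idea of incremental indexes concrete, the sketch below keeps an inverted index (term to document IDs) and, when a document changes, touches only the terms that were added or removed instead of rebuilding the whole index. The class `IncrementalIndex` and its methods are hypothetical and say nothing about how the platform actually stores its indexes.

```python
from collections import defaultdict
from typing import Dict, List, Set


class IncrementalIndex:
    """Toy inverted index that is updated per document, never rebuilt."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = defaultdict(set)
        self._doc_terms: Dict[str, Set[str]] = {}

    def update(self, doc_id: str, text: str) -> None:
        """Add or refresh one document, touching only the changed terms."""
        new_terms = set(text.lower().split())
        old_terms = self._doc_terms.get(doc_id, set())
        for term in old_terms - new_terms:
            self._postings[term].discard(doc_id)
            if not self._postings[term]:
                del self._postings[term]
        for term in new_terms - old_terms:
            self._postings[term].add(doc_id)
        self._doc_terms[doc_id] = new_terms

    def search(self, query: str) -> List[str]:
        """Return sorted IDs of documents containing every query term."""
        terms = query.lower().split()
        if not terms:
            return []
        hits = set.intersection(
            *(self._postings.get(term, set()) for term in terms)
        )
        return sorted(hits)
```

The cost of an update depends on how much of one document changed rather than on the total amount of indexed data, which is the property that lets an index keep up with continuously updated input.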

Open Architecture and API Ensure Extensibility and Pluggability

The platform has an open architecture with a core set of services exposed through APIs. The open architecture and APIs simplify integration with systems like SCM repositories, bug databases, and IDEs. The platform is also extensible: customers or partners can build custom engines and applications for analysis of system data.
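One common way to make an analysis platform pluggable is a registry that custom engines add themselves to by name. The sketch below shows that general pattern only; it does not describe the platform's actual API, and `register_engine`, `run_engine`, and the `error_count` engine are hypothetical names.

```python
from typing import Callable, Dict, List

# Hypothetical engine signature: parsed records in, named metrics out.
Engine = Callable[[List[str]], Dict[str, int]]

_ENGINES: Dict[str, Engine] = {}


def register_engine(name: str) -> Callable[[Engine], Engine]:
    """Decorator that plugs a custom engine into the registry under name."""
    def decorator(func: Engine) -> Engine:
        if name in _ENGINES:
            raise ValueError(f"engine already registered: {name}")
        _ENGINES[name] = func
        return func
    return decorator


@register_engine("error_count")
def count_errors(records: List[str]) -> Dict[str, int]:
    """Example custom engine: count records that mention ERROR."""
    return {"errors": sum("ERROR" in record for record in records)}


def run_engine(name: str, records: List[str]) -> Dict[str, int]:
    """Look up a registered engine by name and run it on the records."""
    return _ENGINES[name](records)
```

With this shape, a partner-built engine needs no changes to the core code: importing the module that defines it is enough to make it available to `run_engine`.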

High Performance and Scalability

The platform enables analysis with real-time response on huge volumes of system data by virtue of unique optimizations:

  • Distributed architecture that scales as you add hardware
  • Domain-specific algorithms that help prune the problem space for pattern discovery
  • Incremental indexes
  • Significant compression of replicated code
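The last point, compression of replicated code, can be illustrated with content-addressed storage: split each file into chunks, store every distinct chunk once under its hash, and keep only a list of hashes per file. This is a minimal sketch of that general technique under assumed names (`DedupStore`, `chunk_lines`), not a description of the platform's storage format.

```python
import hashlib
from typing import Dict, List


class DedupStore:
    """Toy content-addressed store: identical chunks are kept only once."""

    def __init__(self, chunk_lines: int = 4) -> None:
        self.chunk_lines = chunk_lines
        self.chunks: Dict[str, str] = {}       # hash -> chunk text
        self.files: Dict[str, List[str]] = {}  # file name -> chunk hashes

    def add(self, name: str, text: str) -> None:
        """Split text into fixed-size line chunks and store the new ones."""
        lines = text.splitlines(keepends=True)
        keys: List[str] = []
        for start in range(0, len(lines), self.chunk_lines):
            chunk = "".join(lines[start:start + self.chunk_lines])
            key = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
            self.chunks.setdefault(key, chunk)
            keys.append(key)
        self.files[name] = keys

    def read(self, name: str) -> str:
        """Rebuild a file's text from its list of chunk hashes."""
        return "".join(self.chunks[key] for key in self.files[name])
```

Fixed-size chunks keep the example short, but a single inserted line shifts every later chunk boundary, so a real system would need a more robust way to find shared regions.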