Open Source Software Security and Management

Do you know which Open Source components exist in your source code?

Security Tool

Canvass for Security scans source code and identifies OSS components. Using OSS introduces the risk of security vulnerabilities. Our tool accurately identifies the specific project name and version that each OSS component originates from. Our goal is for the tool to be orders of magnitude more accurate than other tools available on the market. We are currently working to integrate more OSS projects into the underlying Security Tool data set.

For more information on how Canvass Labs can assist your company with open source software management, please contact us at info@canvasslabs.com.

Overview

Large organizations have legacy software systems that depend on other software libraries. Tracking these dependencies and identifying newly discovered vulnerabilities in legacy systems is technically challenging and costly. Canvass for Security addresses these issues by providing a service that scans source code files. The service catalogues legacy and third-party software information and combines it with the corresponding vulnerability information. It then provides three levels of scanning (dependency, file, and function) to find dependent software and known vulnerabilities. This approach is scalable because the data can be processed in parallel. In addition, the system provides explainable results to help the end user understand what the tool has found.

Our Approach

Canvass for Security shows precisely which parts of a software developer’s code are vulnerable so that they can quickly repair or replace them as necessary. Specifically, our tool addresses the challenges of dependency tracking, applying patches, and version mapping. Furthermore, subscribing to Canvass Labs’ service allows developers to keep up to date with the latest vulnerabilities and their corresponding software packages and versions. In the following subsections, we describe how we link software versions with their specific vulnerabilities, find the dependencies of a customer’s software, and scan that software to match it against known vulnerable software.

How It Works

A. Linking known vulnerabilities with their origins

The vulnerability information in the National Vulnerability Database (“NVD”) and in security forums typically does not contain the exact URLs of software products or information on how programmers refer to them in dependency management systems (although the sources do share commonly used English names and versions). Typically, programmers or security officers have to match the names and versions of dependent software to the vulnerability information themselves. To build automatic vulnerability scanning services, we need to collect and combine information from separate, independent sources. Our system uses various keyword matching and natural language processing techniques to home in on and match records across each database.

To scale up to billions of records, our system uses a big data store (MongoDB) and a search engine (Apache Solr) to store and index our data. In addition, the system processes each software package independently of the others, which makes the workload well suited to parallelization.
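The matching step described above can be illustrated with a minimal sketch. It assumes a simple keyword-overlap (Jaccard) score between a vulnerability record's product name and package coordinates; this is an illustrative stand-in, not Canvass Labs' actual matching method. The function names, the threshold value, and the sample product strings are all hypothetical.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Optional

def tokens(name: str) -> frozenset:
    """Lowercase a product or package name and split it into keyword tokens."""
    return frozenset(t for t in re.split(r"[^a-z0-9]+", name.lower()) if t)

def jaccard(a: frozenset, b: frozenset) -> float:
    """Keyword overlap between two token sets, from 0.0 to 1.0."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def link_record(vuln_product: str, package_names: list,
                threshold: float = 0.3) -> Optional[str]:
    """Return the package whose name best matches the vulnerability record's
    product name, or None when no candidate reaches the (assumed) threshold."""
    vuln_tokens = tokens(vuln_product)
    best = max(package_names,
               key=lambda p: jaccard(vuln_tokens, tokens(p)),
               default=None)
    if best is None or jaccard(vuln_tokens, tokens(best)) < threshold:
        return None
    return best

def link_all(vuln_products: list, package_names: list,
             threshold: float = 0.3) -> dict:
    """Each record is linked independently of the others, so the work can be
    mapped over a pool. Threads are used here only to keep the sketch small."""
    with ThreadPoolExecutor() as pool:
        matches = pool.map(
            lambda v: link_record(v, package_names, threshold), vuln_products)
        return dict(zip(vuln_products, matches))

# Package coordinates as a dependency manager would list them.
PACKAGES = [
    "org.apache.struts:struts2-core",
    "com.fasterxml.jackson.core:jackson-databind",
    "org.apache.logging.log4j:log4j-core",
]

# Illustrative product names as they might appear in a vulnerability record.
LINKS = link_all(["Apache Struts2 Core", "Jackson Databind", "Unrelated Product"],
                 PACKAGES)
```

Because no record depends on any other, the same per-record function could be distributed across processes or machines without changing its logic.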

B. Dependency level scanning

Modern software development practices rely on software components developed by third parties, known as dependencies. Dependency management tools (e.g. Maven, Gradle, pip) often do not track vulnerability information. Canvass for Security links the specific vulnerabilities from the NVD to each dependency declared in the dependency management tool. Figure 2 illustrates how dependency information is aggregated from the user’s declared build files (such as pom.xml for Java), and how the dependencies are checked against the vulnerability database.
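As a rough illustration of dependency-level scanning, the sketch below reads the dependencies declared in a Maven pom.xml and looks each one up in a vulnerability index. The index contents, the package coordinates, and the identifier "EXAMPLE-0001" are hypothetical placeholders, and the function names are assumptions made for this example. A real scanner would also need to resolve transitive dependencies, version ranges, and property placeholders.

```python
import xml.etree.ElementTree as ET

# Maven POM files declare this XML namespace on the <project> element.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Hypothetical index: (groupId, artifactId, version) -> vulnerability IDs.
VULN_INDEX = {
    ("com.example", "legacy-lib", "1.2.0"): ["EXAMPLE-0001"],
}

def declared_dependencies(pom_text: str) -> list:
    """Return (groupId, artifactId, version) for each directly declared dependency."""
    root = ET.fromstring(pom_text.strip())
    deps = []
    for dep in root.findall("./m:dependencies/m:dependency", POM_NS):
        coords = tuple(
            (dep.findtext(f"m:{tag}", default="", namespaces=POM_NS) or "").strip()
            for tag in ("groupId", "artifactId", "version")
        )
        deps.append(coords)
    return deps

def scan_pom(pom_text: str, vuln_index: dict) -> dict:
    """Map each declared dependency found in the index to its vulnerability IDs."""
    return {
        coords: vuln_index[coords]
        for coords in declared_dependencies(pom_text)
        if coords in vuln_index
    }

SAMPLE_POM = """
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>app</artifactId>
  <version>0.1.0</version>
  <dependencies>
    <dependency>
      <groupId>com.example</groupId>
      <artifactId>legacy-lib</artifactId>
      <version>1.2.0</version>
    </dependency>
    <dependency>
      <groupId>com.example</groupId>
      <artifactId>safe-lib</artifactId>
      <version>2.0.0</version>
    </dependency>
  </dependencies>
</project>
"""

FINDINGS = scan_pom(SAMPLE_POM, VULN_INDEX)
```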

C. File level scanning

Programmers also copy source code files or directories into their own source code. For those cases, Canvass for Security provides a file-level scanning service that detects the copied files and checks for vulnerabilities.
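One simple way to picture file-level detection is content hashing: compute a digest of each file and look it up in an index of digests for known OSS files. The sketch below assumes SHA-256 digests with line endings normalized; the source does not state which technique Canvass for Security actually uses, and the index entry and function names here are hypothetical.

```python
import hashlib
from pathlib import Path

def content_digest(data: bytes) -> str:
    """Digest of file content with Windows line endings normalized, so the same
    file checked out on different platforms yields the same digest."""
    return hashlib.sha256(data.replace(b"\r\n", b"\n")).hexdigest()

def scan_tree(root, index: dict) -> dict:
    """Map each file under root (by relative path) to its known origin, if any."""
    root = Path(root)
    hits = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            origin = index.get(content_digest(path.read_bytes()))
            if origin is not None:
                hits[path.relative_to(root).as_posix()] = origin
    return hits

# Hypothetical catalogued file and its origin label.
KNOWN_SOURCE = b"int add(int a, int b) {\n    return a + b;\n}\n"
INDEX = {content_digest(KNOWN_SOURCE): "example-lib 1.0 (util.c)"}
```

Exact hashing only catches unmodified copies (apart from line endings), which is why a finer-grained level of scanning is needed for altered code.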

D. Function level scanning

For some programming languages, such as C and C++, programmers tend to copy functions rather than entire files, and alter the source code for customization. Once they alter the source code, text-matching techniques used in competitors’ products fail to match or identify the potentially vulnerable functions. Canvass for Security can still detect many of these altered cases because it employs layers of abstraction to extract software signatures.
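To show why abstraction helps where plain text matching fails, here is a minimal sketch of one possible abstraction layer: comments are stripped, every identifier is replaced by a generic ID token, and every number by NUM, before the result is hashed. This is an assumption made for illustration, not the actual signature scheme used by Canvass for Security. The tokenizer is deliberately simplistic (it does not handle string literals or the preprocessor), and the sample C functions are invented.

```python
import hashlib
import re

C_KEYWORDS = frozenset({
    "break", "case", "char", "const", "continue", "default", "do", "double",
    "else", "enum", "float", "for", "goto", "if", "int", "long", "return",
    "short", "signed", "sizeof", "static", "struct", "switch", "typedef",
    "union", "unsigned", "void", "volatile", "while",
})

COMMENT = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)
TOKEN = re.compile(r"[A-Za-z_]\w*|\d\w*|\S")

def signature(func_source: str) -> str:
    """Hash a function after abstracting away names, numbers, comments and layout."""
    code = COMMENT.sub(" ", func_source)
    abstracted = []
    for tok in TOKEN.findall(code):
        if tok in C_KEYWORDS:
            abstracted.append(tok)          # keep language structure
        elif tok[0].isalpha() or tok[0] == "_":
            abstracted.append("ID")         # any identifier
        elif tok[0].isdigit():
            abstracted.append("NUM")        # any numeric literal
        else:
            abstracted.append(tok)          # operators and punctuation
    return hashlib.sha256(" ".join(abstracted).encode("utf-8")).hexdigest()

# An invented function with an off-by-one loop bound.
VULNERABLE = """
int copy_buf(char *dst, const char *src, int n) {
    for (int i = 0; i <= n; i++) dst[i] = src[i];
    return n;
}
"""

# The same function copied and customized: renamed, reformatted, commented.
COPIED = """
int my_copy(char *out, const char *in, int len)
{
    /* customized for our project */
    for (int k = 0; k <= len; k++)
        out[k] = in[k];
    return len;
}
"""

# The repaired version, which has a different structure (< instead of <=).
PATCHED = VULNERABLE.replace("i <= n", "i < n")
```

In this sketch, VULNERABLE and COPIED produce the same signature even though their text differs, while PATCHED produces a different one, so a copied function can be told apart from its repaired form.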

Results

Figure 4 shows an example of finding a bug known as Meltdown. In the pre-processing step shown in Figure 1, Canvass for Security catalogued both the vulnerable functions and the patched functions. It correctly identifies the origin of the dependent software (e.g. package, file, and function names), and highlights the source code lines that cause the vulnerability (in red) and the lines of the function that repair the vulnerability (in green). This type of evidence-based information helps developers find, understand, and reason about each vulnerability. From a performance perspective, the system takes less than 4 seconds to convert the source code text into 96 function signatures, compare them against more than 1.1 million function signatures, and find two vulnerabilities on an AWS virtual node with two vCPUs and 8 GB of RAM.

Why Canvass Labs?

Our CEO and founder saw firsthand the challenges enterprises face in managing their Open Source Software (OSS) risks and was determined to find a better way to solve them. We are a team of computer scientists with decades of experience in machine learning, natural language processing, and artificial intelligence. We are combining our deep knowledge in these areas with our own experience in the OSS space to build a better way to identify and analyze OSS and its associated risks.

Through this work, Canvass Labs is laying the foundation for a test bed to train AI models to identify and fix bugs in software. We are using a large pool of data to train a specific AI model intended to solve a class of common computer security problems. Canvass Labs is developing a test bed and technologies to fix the computer security problems of the future.

Copyright 2019 Canvass Labs.