
Nearly all artificial intelligence (AI) and machine learning (ML) models are currently developed using actual or representative data. Defense and intelligence data is often not readily available in the quantities needed to create highly performant AI/ML models (e.g., it takes roughly 50M pieces of data to create a 60-70% performant model). Synthetic data, or virtually created sensor inputs, can help close this gap and can also reflect rare events and scenarios where data is especially hard to collect, allowing AI to learn from both real and simulated experiences. Sensor synthetic data generation tools have the potential to improve the availability of the training data and representative scenarios required to produce effective ML models.
Patriot Labs is interested in large-scale, accurate, easily accessible training, test, and validation data to support AI model development for multiple domains. This CFI topic encompasses the development of a synthetic data generation tool for sensors (e.g., radar, wide area motion imagery). In the digital battlespace of the future, synthetic data generation techniques may help visualize unknown operating environments, generate data to train algorithmic models that analyze the operational picture, and enable improved AI capabilities.
For purposes of this CFI, solutions may incorporate technology that generates data and/or images for various targets of interest using physical or virtual models of those targets. Additionally, the synthetically generated data should be labeled as it is generated, thereby reducing human data labeling effort. The synthetically generated data must enable the training and testing of the deep learning (DL) neural networks underlying classifier systems.
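As an illustration of labeling data at generation time, the following is a minimal Python sketch, not a prescribed design. It assumes a toy "virtual model" in which each target is a bright rectangle on a noisy background; the class names, footprint sizes, and the `LabeledSample`, `generate_sample`, and `generate_corpus` names are all hypothetical. The point it shows is that the generator already knows the target class and location, so each sample is emitted with its labels and no human annotation step is needed.

```python
import random
from dataclasses import dataclass


@dataclass
class LabeledSample:
    pixels: list        # 2-D list of floats (rows of pixel intensities)
    target_class: str   # class label, known because the generator chose it
    bbox: tuple         # (row_min, col_min, row_max, col_max), inclusive


def generate_sample(rng, size=32, noise_sigma=0.1):
    """Render one toy target chip and return it together with its labels."""
    target_class = rng.choice(["vehicle", "building"])
    # Hypothetical target footprints (height, width) in pixels.
    height, width = (4, 8) if target_class == "vehicle" else (10, 10)
    top = rng.randrange(0, size - height + 1)
    left = rng.randrange(0, size - width + 1)

    # Background: Gaussian noise standing in for sensor clutter.
    pixels = [[rng.gauss(0.0, noise_sigma) for _ in range(size)]
              for _ in range(size)]
    # Target: add a constant return over the target footprint.
    for r in range(top, top + height):
        for c in range(left, left + width):
            pixels[r][c] += 1.0

    bbox = (top, left, top + height - 1, left + width - 1)
    return LabeledSample(pixels, target_class, bbox)


def generate_corpus(n, seed=0):
    """Generate n labeled samples; a fixed seed makes the corpus reproducible."""
    rng = random.Random(seed)
    return [generate_sample(rng) for _ in range(n)]
```

A real tool would replace the rectangle renderer with a physics-based or learned sensor model, but the labels would still come from the scene description rather than from an analyst.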
Solutions should provide methods for effectively expanding the corpus of sensor data and/or images for various targets of interest to defense and intelligence community analysts. This expanded corpus shall provide the ability to train image classifier neural networks for accurate classification of the corresponding targets. Key approach tasks could include developing a deep neural network model that initially leverages existing sensor data classifier network models, training networks using the synthetically generated corpus and existing data, testing the network using both generated and existing held-out sensor data, testing the network using actual sensor data to report the effects of synthetic training, and transitioning final algorithms into an operational architecture. Special consideration will be given to solutions that lead to the creation and integration of mission-focused synthetic data, including but not limited to: Electro-Optical (EO), Synthetic Aperture Radar (SAR), Wide Area Motion Imagery (WAMI), Full Motion Video (FMV), Electronic Intelligence (ELINT) spectra/waveforms, and Variable Message Format (VMF).
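The training and testing tasks listed above can be sketched as a small evaluation harness. The Python below is illustrative only and rests on several assumptions: a nearest-centroid classifier stands in for the deep neural network (it does not model the reuse of existing classifier networks), the data are toy 2-D feature vectors, and the `shift` parameter is a hypothetical way to model a gap between simulated and real sensor data. All function names are invented for this sketch. It shows the bookkeeping only: hold out real data, train with and without the synthetic corpus, and report accuracy on real and synthetic held-out sets.

```python
import random


def make_data(rng, n, shift=0.0):
    """Toy 2-D feature vectors for two target classes (labels 0 and 1)."""
    data = []
    for _ in range(n):
        label = rng.randrange(2)
        center = 2.0 * label + shift
        data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return data


def train_centroids(samples):
    """Nearest-centroid stand-in for the classifier: mean feature per class."""
    sums, counts = {}, {}
    for (x, y), label in samples:
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {k: (sums[k][0] / counts[k], sums[k][1] / counts[k]) for k in sums}


def predict(centroids, point):
    """Return the label whose centroid is closest to the point."""
    return min(centroids,
               key=lambda k: (point[0] - centroids[k][0]) ** 2
                             + (point[1] - centroids[k][1]) ** 2)


def accuracy(centroids, samples):
    correct = sum(predict(centroids, p) == label for p, label in samples)
    return correct / len(samples)


def evaluate_synthetic_effect(seed=0):
    rng = random.Random(seed)
    real = make_data(rng, 60)                    # scarce "actual" sensor data
    synthetic = make_data(rng, 2000, shift=0.2)  # plentiful generated data

    # Hold out part of each corpus for testing.
    real_train, real_holdout = real[:20], real[20:]
    synth_train, synth_holdout = synthetic[:1600], synthetic[1600:]

    baseline = train_centroids(real_train)                 # real data only
    augmented = train_centroids(real_train + synth_train)  # real + synthetic

    return {
        "baseline_on_real": accuracy(baseline, real_holdout),
        "augmented_on_real": accuracy(augmented, real_holdout),
        "augmented_on_synthetic": accuracy(augmented, synth_holdout),
    }
```

Comparing `baseline_on_real` with `augmented_on_real` is one simple way to report the effect of synthetic training on actual sensor data; whether the synthetic corpus helps depends on how closely it matches the real sensor.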
