STARSCAN.AI

FAQ

An AI solution for analyzing technical and scientific documents accurately and efficiently.

STARSCAN.AI is an AI-powered aerospace documentation explorer.


STARSCAN.AI: The app receives user input, scrapes the website, processes the PDFs it finds there, analyzes their contents, and returns a structured response for a specific technical or scientific report. Planned expansions include multi-document and multi-modal analysis, multi-modal outputs, and enhanced security protocols.


Aerospace Industry - NASA


NASA maintains a vast repository of millions of technical and scientific documents that are complex and multidimensional. Inconsistent metadata and missing or irrelevant data pose great challenges, especially when trying to understand technical and scientific requirements. We developed STARSCAN.AI, an LLM-based AI solution that scrapes URLs, processes the documents it finds, and produces an accurate, consistent, and relevant technical report from these vital documents. Through the Streamlit web app interface, users can query or upload NASA docs. These docs are loaded, segmented, and transformed into vector embeddings. We've integrated GPT-4 through Clarifai for in-depth analysis. The Pinecone vector database retrieves relevant data efficiently, while PromptTemplates keep our responses structured and clear. User access is controlled through API keys and Streamlit secrets, and accuracy is controlled through a similarity score.


  • Complexity & Multidimensionality
  • Inconsistent Metadata Use
  • Missing & Irrelevant Data
  • Size & Computational Intensity


User Input: Streamlit-powered querying and document upload.

PDF Processing: Load, split, and convert NASA documents.

Embedding Generation: Convert text to vector embeddings with SentenceTransformer.

LLM Setup: Clarifai's GPT-4 integration for deep text analysis.

Data Retrieval: Use Pinecone for efficient data fetching.

Structured Responses: PromptTemplates guide LLM to generate structured answers.
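
As a rough illustration of the "split" part of the PDF Processing step, the sketch below breaks already-extracted text into overlapping word-based chunks, ready to be embedded. The function name split_into_chunks, the word-based splitting, and the chunk_size and overlap values are assumptions made for this example; the app's actual splitter and settings are not described on this page.

```python
from typing import List


def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into word-based chunks; consecutive chunks share `overlap` words.

    Hypothetical helper for illustration only, not the app's actual splitter.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(10))
    for chunk in split_into_chunks(sample, chunk_size=4, overlap=1):
        print(chunk)
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighbouring chunks, which tends to help retrieval at the cost of some duplicated text.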


API Key Protection: User access is controlled with secret API keys, which are kept in Streamlit secrets (a TOML file) rather than in the source code.

Relevance-Driven Data Filtering: Displays data only if the similarity score exceeds a threshold (e.g., 0.5).

Guided AI Responses: PromptTemplate directs the LLM, mitigating unintended data exposure and helping to avoid LLM hallucinations.

Negative Query Management: Handles off-topic questions, preventing irrelevant outputs.

Targeted Data Processing: Focuses solely on NASA Technical documents, minimizing data risks.

Structured Output: Prompt engineering produces output in a specified content format.
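
The relevance filtering, guided responses, and off-topic handling described above can be sketched roughly as follows. This is a minimal standard-library illustration, not the app's code: the names cosine_similarity, filter_relevant, and build_prompt, the prompt wording, and the report sections are all assumptions, and a plain format string stands in for a LangChain PromptTemplate. Only the 0.5 cutoff comes from the text.

```python
import math
from typing import List, Optional, Sequence, Tuple

SIMILARITY_THRESHOLD = 0.5  # the example cutoff mentioned in the text

# Hypothetical prompt wording; the app's real template is not shown on this page.
PROMPT = (
    "You are analyzing NASA technical documents.\n"
    "Answer using only the context below. If the context does not contain\n"
    "the answer, say that it cannot be answered from these documents.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n\n"
    "Answer with these sections: Summary, Key Requirements, Missing Information."
)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_relevant(
    query_vec: Sequence[float],
    chunks: List[Tuple[str, Sequence[float]]],
    threshold: float = SIMILARITY_THRESHOLD,
) -> List[Tuple[str, float]]:
    """Keep chunks whose similarity to the query exceeds the threshold, best first."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in chunks]
    kept = [(text, score) for text, score in scored if score > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


def build_prompt(question: str, relevant: List[Tuple[str, float]]) -> Optional[str]:
    """Return a filled prompt, or None when nothing relevant was found (off-topic query)."""
    if not relevant:
        return None
    context = "\n\n".join(text for text, _ in relevant)
    return PROMPT.format(context=context, question=question)
```

Returning None for an empty result lets the caller show a fixed "not covered by these documents" message instead of sending an ungrounded question to the LLM.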


Clarifai: Enables GPT-4 LLM integration.

BeautifulSoup: Extracts NASA Technical Bulletin links | Fetches and segments PDFs.

LangChain: PromptTemplate for LLMs | Links document retrieval, vectorization, and LLM interaction.

HuggingFace & Pinecone: Transform text into vectors and store them in the Pinecone vector database for quick retrieval.

GPT-4: Analyzes scientific text, fills in missing data, removes irrelevant data, and formulates reports.

Streamlit: Offers interactive UI for querying NASA documents on a web app.
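
To give a feel for the link-extraction step, the sketch below collects PDF links from a page's HTML. It deliberately uses Python's standard-library html.parser in place of BeautifulSoup so that it runs without extra dependencies; the class name PdfLinkParser, the function extract_pdf_links, and the example.org URLs are hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collects absolute URLs of <a href="..."> links that point at PDF files."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.pdf_links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Ignore any query string when checking the file extension.
        if href and href.lower().split("?")[0].endswith(".pdf"):
            self.pdf_links.append(urljoin(self.base_url, href))


def extract_pdf_links(html: str, base_url: str) -> List[str]:
    """Return PDF links found in `html`, resolved against `base_url`."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.pdf_links


if __name__ == "__main__":
    page = '<a href="/docs/tb-01.pdf">Bulletin</a> <a href="about.html">About</a>'
    print(extract_pdf_links(page, "https://example.org/index/"))
```

Each returned URL could then be downloaded and passed to the PDF loading and splitting stage.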


https://louisljz-nasa-space-apps.streamlit.app/


It is an AI solution for analyzing technical and scientific documents accurately and efficiently. The methodology includes:

  • Semantic vector embeddings
  • Standardized data representation
  • Threshold-based similarity check
  • Efficient chunk-based processing
  • A complex, time-consuming challenge handled intelligently with LLM technology


Explore STARSCAN.AI's AI Pipeline

Copyright © 2025 STARSCAN.AI - All Rights Reserved.
