# Stock-Market-Prediction

This repository began as a **7th-semester minor project** and evolved into our **8th-semester major project**, **"Advanced Stock Price Forecasting Using a Hybrid Model of Numerical and Textual Analysis."** It utilizes **Python, NLP (NLTK, spaCy), ML models, Grafana, InfluxDB, and Streamlit** for data analysis and visualization.


![📈 Stock Market Illustration](https://github.com/user-attachments/assets/f5751f74-43c5-4045-aa9f-bb7abd19c1aa)

💡 Real Time Prediction

## Project Description

The **Advanced Stock Price Forecasting Using a Hybrid Model of Numerical and Textual Analysis** project involves a comprehensive approach to predicting stock prices using both numerical data and textual analysis. The project components include:

1. **Data Collection and Storage**: We gathered historical stock data of major companies and stored it in an InfluxDB database to efficiently handle large-scale time-series data.
2. **Data Visualization**: A Grafana dashboard has been set up for real-time visualization of stock prices and analysis results, enhancing data interpretation and decision-making processes.
3. **Textual Analysis for Enhanced Forecasting**: We utilized Natural Language Processing (NLP) libraries, such as NLTK and spaCy, to analyze financial news and reports. This component complements numerical analysis to improve the accuracy of our hybrid forecasting model.
4. **Machine Learning Models**: The project used models including Naive Bayes, MLP (Multi-Layer Perceptron), Logistic Regression, and Random Forest to process both numerical and textual data, creating a robust and comprehensive stock prediction system.
5. **Reddit Chatbot Data Visualization Integration**: The project involved adding static and interactive plots to represent chatbot data from Reddit, using Matplotlib and Seaborn to visualize user interactions, message frequency, and topic distribution effectively.
6. **Collaboration and Project Management**: The repository includes contributions from all team members with well-organized tasks, ensuring seamless collaboration and effective version control.
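As a rough illustration of the data-storage step (component 1 above), the sketch below formats one daily price row as an InfluxDB line-protocol string, the text format InfluxDB accepts for writes. The measurement name `stock_price`, the `ticker` tag, the function name, and all sample values except the 2024-10-17 GOOG close are assumptions made for this example; the project's actual schema may differ.

```python
from datetime import datetime, timezone


def to_line_protocol(measurement, ticker, day, open_, high, low, close, volume):
    """Format one daily OHLCV row as an InfluxDB line-protocol string.

    The measurement and tag names are hypothetical, chosen for illustration.
    """
    # Midnight UTC of the trading day, as nanoseconds since the Unix epoch
    # (nanoseconds are the default timestamp precision of the line protocol).
    ts = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    ts_ns = int(ts.timestamp()) * 1_000_000_000
    # Float fields are written as plain numbers; integer fields take an "i" suffix.
    fields = (
        f"open={float(open_)},high={float(high)},low={float(low)},"
        f"close={float(close)},volume={int(volume)}i"
    )
    return f"{measurement},ticker={ticker} {fields} {ts_ns}"


# Illustrative row: only the close price comes from the dataset table.
line = to_line_protocol(
    "stock_price", "GOOG", "2024-10-17", 165.0, 166.2, 163.9, 164.51, 1000000
)
print(line)
```

A string like this could be sent through any InfluxDB write path (client library, CLI, or HTTP API); storing the ticker as a tag keeps per-company queries cheap, since tags are indexed while fields are not.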
## Directory Structure

```markdown
📁 Stock-Market-Prediction/
├── 📁 Codes/                              # 🧠 Core code modules for analysis and app development
│   ├── 📁 Historical_Data_Analysis/       # 📊 Scripts for long-term market trend analysis
│   ├── 📁 Partial_Data_Analysis/          # 📉 Focused short-term or segmented data analysis
│   ├── 📁 Ticker_Symbols_Stocks/          # 💹 Scripts for retrieving and managing ticker symbols
│   └── 📁 Flask_App/                      # 🌐 Web interface built using Flask framework
│
├── 📁 Conferences/                        # 🎤 Presentation and academic conference materials
│
├── 📁 Documents/                          # 📚 Miscellaneous reports and supporting documents
│   ├── 📁 Major_Project/                  # 🏆 Final-year major project resources
│   │   ├── 📁 PPT/                        # 📽️ Presentation slides for the major project
│   │   ├── 📁 Proforma_&_Progress_Report/ # 📝 Official progress reports and planning forms
│   │   └── 📁 Thesis/                     # 📖 Final thesis document with research and results
│   │
│   └── 📁 Minor_Project/                  # 🎯 Minor-project materials
│       ├── 📁 PPT/                        # 🧾 Slides prepared for minor project presentation
│       ├── 📁 Proforma_&_Progress_Report/ # 📋 Progress reports and planning forms for minor project
│       └── 📁 Thesis/                     # 📘 Final minor project report or thesis
│
├── 📁 Reference_Documents/                # 🔍 Research papers and helpful external references
│
├── 📁 Resources/                          # 🛠️ Datasets, libraries, and supporting tools
│
├── 📄 LICENSE                             # 📄 Terms and conditions for usage and distribution
└── 📄 README.md                           # 📘 Overview, setup guide, and project introduction
```

### 📄 Thesis Reports

> 📖 Major & Minor Project Reports
>
> Detailed thesis reports for both major and minor projects are available under their respective [`Thesis`](Documents/Thesis/) folders in [`Documents/`](Documents/).
### πŸ—ƒοΈ InfluxDB Setup Guide > πŸ“– Time-Series Data Storage & Integration > Step-by-step InfluxDB setup and data integration guide is available at [`Codes/Historical_Data_Analysis/InfluxDB/`](Codes/Historical_Data_Analysis/InfluxDB/). ### πŸ“Š Grafana Dashboard Guide > πŸ“– Visualization Dashboard Setup > Grafana dashboard setup and InfluxDB connection guide is available at [`Codes/Historical_Data_Analysis/Grafana_Dashboard/`](Codes/Historical_Data_Analysis/Grafana_Dashboard/). --- ## Dataset Used | Company | Description | Data Range | Dataset Shape | Starting Stock Date | Current Stock Date | Starting Stock Price | Current Stock Price | |-----------------------------------|-------------------------------------------------------------------------------------------------|----------------------|---------------|---------------------|--------------------|----------------------|----------------------| |
Alphabet Inc. (Google) [GOOG] | Specializes in Internet-related services and products, including search engines, online advertising, and cloud computing. | 2014-03-27 : 2024-10-17 | (2659, 5) | 2014-03-27 | 2024-10-17 | $27.8542 | $164.51 | |
Amazon.com Inc. [AMZN] | Started as an online bookstore, now a leader in e-commerce and cloud computing through AWS. | 1997-05-16 : 2024-10-17 | (6901, 5) | 1997-05-16 | 2024-10-17 | $0.0863 | $187.53 | |
Apple Inc. [AAPL] | Renowned for innovative consumer electronics, software, and services, including the iPhone and Mac computers. | 1980-12-12 : 2024-10-17 | (11053, 5) | 1980-12-12 | 2024-10-17 | $0.0992 | $232.15 | |
Meta Platforms [META] | Owner of Facebook, a global leader in social media and digital advertising. | 2012-05-18 : 2024-10-17 | (3124, 5) | 2012-05-18 | 2024-10-17 | $38.1174 | $576.93 | |
Microsoft Corp. [MSFT] | A leading developer of software, consumer electronics, and personal computers. | 1986-03-13 : 2024-10-17 | (9728, 5) | 1986-03-13 | 2024-10-17 | $0.0603 | $416.72 | |
Netflix Inc. [NFLX] | Global streaming entertainment service with a vast library of TV series and films. | 2002-05-23 : 2024-10-17 | (5640, 5) | 2002-05-23 | 2024-10-17 | $1.1964 | $687.65 | |
Nvidia Corp. [NVDA] | Specializes in graphics processing units and AI technology. | 1999-01-22 : 2024-10-17 | (6477, 5) | 1999-01-22 | 2024-10-17 | $0.0377 | $136.93 | |
Tata Consultancy Services [TCS] | Leading global IT services, consulting, and business solutions provider. | 2013-11-01 : 2024-10-17 | (2758, 5) | 2013-11-01 | 2024-10-17 | $543.0 | $11.8 | --- ## Tools and Technologies 1. **Python**: Core programming language used for data analysis, model building, and backend development. 2. **GitHub**: Platform for version control and collaborative development. 3. **InfluxDB**: Database for efficient time-series data storage and retrieval. 4. **Grafana**: Tool for real-time data visualization and dashboard creation. 5. **Streamlit**: Framework for creating interactive web applications. 6. **Flask**: Lightweight framework for developing the project’s backend. 7. **Pandas**: Library for data manipulation and analysis. 8. **Matplotlib & Plotly**: Libraries for data visualization and graphical representation. 9. **NLP Libraries (NLTK, spaCy)**: Tools for processing and analyzing text data. 10. **Machine Learning Libraries**: Used for implementing models like Naive Bayes, MLP, Logistic Regression, and Random Forest. --- ## Project Database & Dashboard For easy visualization and data management, we are using the following tools: ### InfluxDB Database ### 1. Numerical Analysis ![Numerical Analysis Snapshot](https://github.com/madhurimarawat/Stock-Market-Prediction/raw/main/Codes/Historical_Data_Analysis/Database_Dashboard_Snapshots/InfluxDB_Database/Numerical_Analysis_1.png) ### 2. Model Prediction ![Model Prediction Snapshot](https://github.com/madhurimarawat/Stock-Market-Prediction/raw/main/Codes/Historical_Data_Analysis/Database_Dashboard_Snapshots/InfluxDB_Database/Model_Prediction_1.png) ### 3. Textual Analysis ![Textual Analysis Snapshot](https://github.com/madhurimarawat/Stock-Market-Prediction/raw/main/Codes/Historical_Data_Analysis/Database_Dashboard_Snapshots/InfluxDB_Database/Textual_Analysis_1.png) ### 4. 
Hybrid Model ![Hybrid Model Snapshot](https://github.com/madhurimarawat/Stock-Market-Prediction/raw/main/Codes/Historical_Data_Analysis/Database_Dashboard_Snapshots/InfluxDB_Database/Hybrid_Model_1.png) ` ### Grafana Dashboard ### 1. Stock Market and Project Dashboard Overview This section provides an overview of the stock market, project details, and descriptions of the companies used in the project, including MAANG, Nvidia, Microsoft, and TCS. ![Stock Market Overview Snapshot 1](https://github.com/madhurimarawat/Stock-Market-Prediction/raw/main/Codes/Historical_Data_Analysis/Database_Dashboard_Snapshots/Grafana_Dashboard/Stock_Market_and_Project_Dashboard_Overview/Dashboard_1.png) ![Stock Market Overview Snapshot 2](https://github.com/madhurimarawat/Stock-Market-Prediction/raw/main/Codes/Historical_Data_Analysis/Database_Dashboard_Snapshots/Grafana_Dashboard/Stock_Market_and_Project_Dashboard_Overview/Dashboard_Panel_Dashboard_Description_and_Datasets_Information_View_1.png) ![Stock Market Overview Snapshot 3](https://github.com/madhurimarawat/Stock-Market-Prediction/raw/main/Codes/Historical_Data_Analysis/Database_Dashboard_Snapshots/Grafana_Dashboard/Stock_Market_and_Project_Dashboard_Overview/Dashboard_Panel_Dashboard_Description_and_Datasets_Information_View_2.png) ![Stock Market Overview Snapshot 4](https://github.com/madhurimarawat/Stock-Market-Prediction/raw/main/Codes/Historical_Data_Analysis/Database_Dashboard_Snapshots/Grafana_Dashboard/Stock_Market_and_Project_Dashboard_Overview/Dashboard_Panel_Stock_Market_View_1.png) ### 2. Numerical Analysis This section displays numerical data of the stock market, featuring graphs of open, high, low, and close prices along with volume bar plots, RSI, and moving averages. 
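For readers unfamiliar with the two indicators named above, here is a minimal pure-Python sketch of a simple moving average and an RSI using Wilder's smoothing. It illustrates the standard definitions only and is not taken from the project's code; the function names, the 14-day default period, and the convention of returning 100 when there are no losses are assumptions for this example.

```python
def simple_moving_average(prices, window):
    """Return the SMA series; entries before a full window exists are None."""
    if window <= 0:
        raise ValueError("window must be positive")
    out = []
    running = 0.0
    for i, price in enumerate(prices):
        running += price
        if i >= window:
            # Drop the price that just left the window.
            running -= prices[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out


def rsi(prices, period=14):
    """Latest RSI value using Wilder's smoothing, or None if data is too short."""
    if len(prices) <= period:
        return None
    changes = [b - a for a, b in zip(prices, prices[1:])]
    gains = [max(c, 0.0) for c in changes]
    losses = [max(-c, 0.0) for c in changes]
    # Seed with plain averages over the first `period` changes.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    # Then smooth each later change into the running averages.
    for gain, loss in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + gain) / period
        avg_loss = (avg_loss * (period - 1) + loss) / period
    if avg_loss == 0:
        return 100.0  # convention chosen here for a window without losses
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


closes = [10, 11, 10, 11, 10]
print(simple_moving_average(closes, 3))
print(rsi(closes, period=4))
```

Both functions take a plain list of closing prices, so they could be applied to any of the per-company series before the results are written to the database or plotted.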
![Numerical Analysis Snapshot 1](https://github.com/madhurimarawat/Stock-Market-Prediction/raw/main/Codes/Historical_Data_Analysis/Database_Dashboard_Snapshots/Grafana_Dashboard/Numerical_Analysis/Dashboard_1.png)

![Numerical Analysis Snapshot 2](https://github.com/madhurimarawat/Stock-Market-Prediction/raw/main/Codes/Historical_Data_Analysis/Database_Dashboard_Snapshots/Grafana_Dashboard/Numerical_Analysis/Dashboard_2.png)

![Numerical Analysis Snapshot 3](https://github.com/madhurimarawat/Stock-Market-Prediction/raw/main/Codes/Historical_Data_Analysis/Database_Dashboard_Snapshots/Grafana_Dashboard/Numerical_Analysis/Dashboard_3.png)

### 3. Model Prediction

This section highlights model predictions, including individual and comparative graphs of predicted and actual values for stock prices, as well as predicted RSI and moving averages.

![Model Prediction Snapshot 1](https://github.com/madhurimarawat/Stock-Market-Prediction/raw/main/Codes/Historical_Data_Analysis/Database_Dashboard_Snapshots/Grafana_Dashboard/Model_Prediction/Dashboard_1.png)

![Model Prediction Snapshot 2](https://github.com/madhurimarawat/Stock-Market-Prediction/raw/main/Codes/Historical_Data_Analysis/Database_Dashboard_Snapshots/Grafana_Dashboard/Model_Prediction/Dashboard_2.png)

![Model Prediction Snapshot 3](https://github.com/madhurimarawat/Stock-Market-Prediction/raw/main/Codes/Historical_Data_Analysis/Database_Dashboard_Snapshots/Grafana_Dashboard/Model_Prediction/Dashboard_Panel_Model_Evaluation_View.png)

### 4. Textual Analysis

This section visualizes sentiment analysis from news headlines, showcasing positive, negative, and neutral sentiment scores.
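To make the three sentiment scores concrete, the sketch below assigns positive, negative, and neutral shares to a headline using a tiny hand-written word list. This is a deliberately simplified stand-in: the project's pipeline uses NLTK and spaCy, and the lexicon, function name, and sample headline here are all hypothetical.

```python
import re

# Hypothetical mini-lexicon for illustration only; a real system would rely on
# a full sentiment resource or a trained model.
POSITIVE = {"gain", "gains", "surge", "surges", "beat", "beats", "record", "growth", "profit"}
NEGATIVE = {"loss", "losses", "fall", "falls", "drop", "drops", "miss", "misses", "lawsuit"}


def headline_sentiment(headline):
    """Return the share of positive, negative, and neutral tokens in a headline."""
    tokens = re.findall(r"[a-z']+", headline.lower())
    if not tokens:
        # Nothing to score, so treat the headline as fully neutral.
        return {"positive": 0.0, "negative": 0.0, "neutral": 1.0}
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    n = len(tokens)
    return {"positive": pos / n, "negative": neg / n, "neutral": (n - pos - neg) / n}


# Made-up headline used only to exercise the function.
scores = headline_sentiment("Profit surges as cloud growth beats forecasts")
print(scores)
```

The three shares always sum to 1, which makes them easy to plot as stacked series per day or per company.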
![Textual Analysis Snapshot 1](https://github.com/madhurimarawat/Stock-Market-Prediction/raw/main/Codes/Historical_Data_Analysis/Database_Dashboard_Snapshots/Grafana_Dashboard/Textual_Analysis/Dashboard_1.png)

![Textual Analysis Snapshot 2](https://github.com/madhurimarawat/Stock-Market-Prediction/raw/main/Codes/Historical_Data_Analysis/Database_Dashboard_Snapshots/Grafana_Dashboard/Textual_Analysis/Dashboard_2.png)

### 5. Hybrid Model

The hybrid model combines numerical and textual data for a comprehensive analysis.

![Hybrid Model Snapshot 1](https://github.com/madhurimarawat/Stock-Market-Prediction/raw/main/Codes/Historical_Data_Analysis/Database_Dashboard_Snapshots/Grafana_Dashboard/Hybrid_Model/Dashboard_1.png)

![Hybrid Model Snapshot 2](https://github.com/madhurimarawat/Stock-Market-Prediction/raw/main/Codes/Historical_Data_Analysis/Database_Dashboard_Snapshots/Grafana_Dashboard/Hybrid_Model/Dashboard_2.png)

![Hybrid Model Snapshot 3](https://github.com/madhurimarawat/Stock-Market-Prediction/raw/main/Codes/Historical_Data_Analysis/Database_Dashboard_Snapshots/Grafana_Dashboard/Hybrid_Model/Dashboard_3.png)

![Hybrid Model Snapshot 4](https://github.com/madhurimarawat/Stock-Market-Prediction/raw/main/Codes/Historical_Data_Analysis/Database_Dashboard_Snapshots/Grafana_Dashboard/Hybrid_Model/Dashboard_4.png)

---

## Project Deployment

1. **Streamlit:** The app is deployed at:
   [Stock Market Numerical and Text Hybrid Prediction](https://stock-market-numerical-text-hybrid-prediction.streamlit.app/)

   *Streamlit app screenshots: View 1, View 2, View 3.*

2. **Flask:** The app code is available here: [Flask App Codes](https://github.com/madhurimarawat/Stock-Market-Prediction/tree/main/Codes/Flask_App).

   Here's an overview of the Flask App:

   ![Flask App GIF](https://github.com/user-attachments/assets/86f149d3-d746-4008-8a0c-6dfd6f539f54)

---

## Meet the Team

| | Madhurima Rawat | Geetanshu Dev Meshram | Sneha Jha |
|---|---|---|---|
| **Role** | Project Planner & Developer | Data Analyst & Backend Developer | Data Analyst |
| **Responsibilities** | Project planning, managing GitHub repo, docs, InfluxDB setup, Grafana dashboard, Streamlit & Flask, data viz & preprocessing | Model building for numerical data, Flask app design | Text data processing, model building, hybrid model creation |
| **Tools** | GitHub, InfluxDB, Grafana, Streamlit, Python, Flask, Pandas, Matplotlib, Plotly | Python, Flask, ML libraries | NLP libraries, ML libraries, hybrid modeling tools |

---

## Resources

1. **Partial Data Analysis**:
   - Historical Stock Prices: [Yahoo Finance](https://finance.yahoo.com/)
   - **Textual and Hybrid Data:**
     - [Kaggle Dataset - News & Stock Prices](https://www.kaggle.com/datasets/kianso/news-stock-price)

2. **Complete Historical Data**:
   - **Alphabet (Google) (GOOG)**: [Google Stock Price](https://www.macrotrends.net/stocks/charts/GOOG/google/stock-price-history)
   - **Apple (AAPL)**: [Apple Stock Price](https://www.macrotrends.net/stocks/charts/AAPL/apple/stock-price-history)
   - **Amazon (AMZN)**: [Amazon Stock Price](https://www.macrotrends.net/stocks/charts/AMZN/amazon/stock-price-history)
   - **Meta (META)**: [Meta Stock Price](https://www.macrotrends.net/stocks/charts/META/meta-platforms/stock-price-history)
   - **Netflix (NFLX)**: [Netflix Stock Price](https://www.macrotrends.net/stocks/charts/NFLX/netflix/stock-price-history)
   - **Nvidia (NVDA)**: [Nvidia Stock Price](https://www.macrotrends.net/stocks/charts/NVDA/nvidia/stock-price-history)
   - **Microsoft (MSFT)**: [Microsoft Stock Price](https://www.macrotrends.net/stocks/charts/MSFT/microsoft/stock-price-history)
   - **TCS**: [TCS Stock Price](https://www.macrotrends.net/stocks/charts/TCS/container-store/stock-price-history)

3. **Illustration Links:**
   - [Project Resources (Illustration 1)](https://img.freepik.com/premium-vector/flat-design-stock-market-analysis_23-2148590818.jpg)
   - [Streamlit App Background Image (Illustration 2)](https://vectormine.b-cdn.net/wp-content/uploads/Stock_Market.jpg)

---

## Thanks for Visiting 😄

- Drop a 🌟 if you find this repository useful.

- If you have any questions or suggestions, feel free to reach out to us.

- **Contribute and Discuss:** Feel free to open issues 🐛, submit pull requests 🛠️, or start discussions 💬 to help improve this repository!