This AI-Guided Tutorial is designed to familiarize learners with fundamental Python packages used in scientific computing and data analysis.


AI-Guided Tutorial: SciPy, Statsmodels, and Matplotlib Fundamentals

Learn how to use powerful Python libraries for scientific computing, statistical modeling, and data visualization through interactive AI guidance.

Learning Objectives

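By the end of this tutorial, you should understand how to check that the required packages are installed, plot curves with Matplotlib, generate and summarize random data with SciPy, fit an OLS regression with statsmodels, and visualize a fitted model alongside its data.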

Tutorial Instructions

Part 1: Setup and Package Installation Check
  1. Open VS Code and create scientific_computing_tutorial.py.
  2. Ask Copilot Chat: “Write Python code that checks if the SciPy, statsmodels, and matplotlib packages are installed.”
  3. Run the code and report the outcome here. If any package is missing, use Copilot Chat to find the necessary installation command (e.g., using pip).
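
For comparison with what Copilot Chat returns, here is a minimal sketch of one way to perform the check. It assumes the import names scipy, statsmodels, and matplotlib; the helper find_missing and the constant REQUIRED_PACKAGES are illustrative names, not part of any library.

```python
import importlib.util

# Import names of the packages this tutorial relies on.
REQUIRED_PACKAGES = ["scipy", "statsmodels", "matplotlib"]


def find_missing(packages):
    """Return the names in `packages` that cannot be located for import."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_PACKAGES)
if missing:
    print("Missing packages:", ", ".join(missing))
    # For these three packages the pip name matches the import name.
    print("Try: python -m pip install " + " ".join(missing))
else:
    print("All required packages are installed.")
```

Using importlib.util.find_spec locates each package without importing it, so the check stays fast and does not fail partway through if one package is absent.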

Part 2: Basic Matplotlib Plotting

Ask Copilot Chat: How do I import the matplotlib.pyplot module and create a line graph showing two distinct curves (like a sine and cosine wave) on the same plot? Show me how to define axis limits and add a legend.

Practice Task:

  1. Use the line import matplotlib.pyplot as plt.
  2. Create data points for a sine wave and a cosine wave.
  3. Plot both curves on the same figure, ensuring the axes are labeled and a title is included.
  4. Add a legend to distinguish the two curves.
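
The practice task above could be approached roughly as follows. This is a sketch, not the only layout: the function name plot_sine_cosine, the 200 sample points, and the axis limits are assumptions chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_sine_cosine():
    """Draw sin(x) and cos(x) on one set of axes and return (fig, ax)."""
    # 200 evenly spaced points covering one full period.
    x = np.linspace(0, 2 * np.pi, 200)

    fig, ax = plt.subplots()
    ax.plot(x, np.sin(x), label="sin(x)")
    ax.plot(x, np.cos(x), label="cos(x)", linestyle="--")

    # Explicit axis limits, labels, title, and legend.
    ax.set_xlim(0, 2 * np.pi)
    ax.set_ylim(-1.5, 1.5)
    ax.set_xlabel("x (radians)")
    ax.set_ylabel("Amplitude")
    ax.set_title("Sine and Cosine Waves")
    ax.legend()
    return fig, ax


fig, ax = plot_sine_cosine()
```

Call plt.show() afterward to open the plot window, or fig.savefig with a file name of your choice to write the figure to disk.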

Part 3: Data Generation with SciPy

Ask Copilot Chat: How can I use the scipy.stats submodule to generate 100 random data points following a normal distribution? Also, show me how to calculate the skewness and kurtosis of this generated data using SciPy’s statistical functions.

Practice Task:

  1. Generate a list or array of 100 random numbers using a normal distribution function from SciPy.
  2. Calculate and print the mean, median, and variance of the generated dataset (for example, scipy.stats.describe reports the mean and variance, and numpy.median gives the median).
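
One possible way to carry out this part is sketched below. The seed value, the standard normal parameters (loc=0, scale=1), and the variable names are assumptions made so the run is repeatable; the median is taken from NumPy here.

```python
import numpy as np
from scipy import stats

# Fixed seed so repeated runs produce the same sample.
rng = np.random.default_rng(42)

# 100 draws from a normal distribution with mean 0 and standard deviation 1.
data = stats.norm.rvs(loc=0, scale=1, size=100, random_state=rng)

# describe() bundles several summary statistics into one result object.
summary = stats.describe(data)
print("mean:", summary.mean)
print("variance:", summary.variance)  # sample variance (ddof=1 by default)
print("median:", np.median(data))

# Shape statistics requested in the Copilot Chat prompt.
print("skewness:", stats.skew(data))
print("excess kurtosis:", stats.kurtosis(data))  # Fisher definition: 0 for a normal
```

With only 100 points, the skewness and excess kurtosis will usually be close to, but not exactly, zero.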

Part 4: Simple Statistical Modeling with Statsmodels

Ask Copilot Chat: How do I use the statsmodels.api library to perform a simple Ordinary Least Squares (OLS) regression? Show me how to define synthetic independent (X) and dependent (Y) variables, fit the model, and display the summary.

Practice Task:

  1. Define a synthetic dataset where Y is approximately equal to 2 * X plus some random noise.
  2. Fit an OLS model using statsmodels to determine the relationship between your synthetic X and Y variables.
  3. Print the summary output of the OLS model.

Part 5: Combining Packages (Visualization of Statistical Results)

Ask Copilot Chat: How can I plot the actual data points (using a matplotlib scatter plot) and overlay the best-fit regression line generated from a statsmodels OLS result?

Practice Task:

  1. Using the data and OLS model from Part 4, create a figure.
  2. Plot the synthetic X and Y data points as a scatter plot.
  3. Calculate the predicted Y values from your OLS model and overlay them as a line, making the line clearly distinguishable from the scatter points (e.g., by using a different color).