By the end of this module, students will be able to:
Decompose a time series into seasonal, trend, and residual components to better understand patterns in data collected from sensors
Describe different types of noise in sensor measurements and explain how noise propagates through both direct and virtual measurements
Work with virtual sensors and explain how they derive measurements from other sensors or models
Recognize the importance of measurement uncertainty and error in sensor-based systems
3.1.2 Topics Covered
Time series decomposition
Noise characterization
Virtual sensors
Data quality
3.1.3 Project Milestones
Identify your teammates for the group project (groups of 1-3 people).
3.2 Source Material
3.2.1 Introduction to Signals
In civil and environmental engineering, we often collect data from sensors that measure physical quantities over time. These measurements form what we call signals.
3.2.1.1 Continuous-Time Signals
A continuous-time signal is a function that maps time to a measured value. We denote such a signal as:
\[x(t), \quad t \in \mathbb{R}\]
where \(t\) represents continuous time and \(x(t)\) represents the value of the signal at time \(t\). For example, the temperature measured by a sensor at any instant in time could be represented as \(T(t)\).
3.2.1.2 Discrete-Time Signals
In practice, sensors sample physical quantities at discrete time intervals. A discrete-time signal is represented as:
\[x[n], \quad n \in \mathbb{Z}\]
or equivalently:
\[x_n, \quad n = 0, 1, 2, \ldots\]
where \(n\) is the sample index and \(x_n\) represents the value of the signal at the \(n\)-th sample. The relationship between continuous and discrete time is given by:
\[x[n] = x(t_n) = x(n \cdot \Delta t)\]
where \(\Delta t\) is the sampling interval. For instance, if we sample temperature every 5 minutes, then \(\Delta t = 5\) minutes and \(T_n\) represents the temperature at the \(n\)-th five-minute interval.
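The sampling relation \(x[n] = x(n \cdot \Delta t)\) can be sketched in a few lines of plain JavaScript. This is only an illustration: the function `temperatureAt` and its sinusoidal daily cycle are hypothetical stand-ins for a real sensor signal, and `sampleSignal` is a helper name introduced here.

```javascript
// Hypothetical continuous-time signal: a daily temperature cycle in °C.
// tHours is continuous time measured in hours.
function temperatureAt(tHours) {
  return 20 + 3 * Math.sin((2 * Math.PI * tHours) / 24);
}

// Sample x(t) at t_n = n * dt for n = 0, 1, ..., count - 1.
function sampleSignal(x, dt, count) {
  return Array.from({ length: count }, (_, n) => x(n * dt));
}

const dtHours = 5 / 60;              // sampling interval: 5 minutes, in hours
const samplesPerDay = (24 * 60) / 5; // 288 samples in one day
const T = sampleSignal(temperatureAt, dtHours, samplesPerDay);

console.log(T.length);        // 288
console.log(T[0].toFixed(2)); // 20.00 (value at t = 0)
```

Here `T[n]` plays the role of \(T_n\): the temperature at the \(n\)-th five-minute interval.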
3.2.2 Signal Decomposition
Time series data from sensors often contain multiple underlying patterns. A common and useful approach is to decompose a signal into additive components that capture different temporal characteristics.
3.2.2.1 Additive Decomposition Model
We can express a continuous-time signal \(x(t)\) as the sum of multiple components:
\[x(t) = x_{\text{seasonal}}(t) + x_{\text{trend}}(t) + x_{\text{residual}}(t)\]
where:
Seasonal component \(x_{\text{seasonal}}(t)\): Captures periodic patterns that repeat over fixed intervals (e.g., daily cycles, annual cycles). This can be modeled as a moving average over a long temporal window (e.g., annual timescale).
Trend component \(x_{\text{trend}}(t)\): Captures medium-term trends and variations (e.g., day-of-year patterns, seasonal weather changes). This can be modeled as a moving average over an intermediate temporal window.
Residual component \(x_{\text{residual}}(t)\): Captures short-term fluctuations and noise that remain after removing the seasonal and trend components. Rather than modeling this term explicitly (i.e., deriving it from the measurements through some function, as we did for the trend and seasonal components), we define it directly as the residual of the signal: whatever is left after removing the seasonal and trend components.
The seasonal and trend components can each be thought of as a moving average with a different window size:
\[\bar{x}_W(t) = \frac{1}{W}\int_{t - W/2}^{t + W/2} x(\tau)\, d\tau\]
where \(W\) is the window length appropriate for that component.
This decomposition helps us understand and isolate different sources of variation in sensor data, making it easier to identify patterns, detect anomalies, and build predictive models.
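A minimal sketch of such a decomposition on sampled data follows. Several choices here are assumptions rather than part of the source: the moving average is centered and truncated at the edges, the intermediate-window average is applied to what remains after subtracting the long-window component (so that the three parts add up exactly), and the synthetic signal and window lengths are arbitrary illustrative values.

```javascript
// Centered moving average with window length W (in samples, ideally odd).
// Near the edges the window is truncated to the samples that exist.
function movingAverage(x, W) {
  const half = Math.floor(W / 2);
  return x.map((_, n) => {
    const lo = Math.max(0, n - half);
    const hi = Math.min(x.length - 1, n + half);
    let sum = 0;
    for (let k = lo; k <= hi; k++) sum += x[k];
    return sum / (hi - lo + 1);
  });
}

// Additive decomposition following the naming used in this section:
// long window -> "seasonal", intermediate window -> "trend", rest -> "residual".
function decompose(x, longWindow, midWindow) {
  const seasonal = movingAverage(x, longWindow);
  const remainder = x.map((v, n) => v - seasonal[n]);
  const trend = movingAverage(remainder, midWindow);
  const residual = x.map((v, n) => v - seasonal[n] - trend[n]);
  return { seasonal, trend, residual };
}

// Synthetic hourly signal over 30 days (illustrative values only):
// slow drift + daily cycle + a small fast wiggle.
const x = Array.from({ length: 30 * 24 }, (_, n) =>
  15 + 0.005 * n + 2 * Math.sin((2 * Math.PI * n) / 24) + 0.3 * Math.sin(1.7 * n)
);

const { seasonal, trend, residual } = decompose(x, 7 * 24 + 1, 5);

// By construction the three components add back up to the original signal.
const maxError = Math.max(
  ...x.map((v, n) => Math.abs(v - (seasonal[n] + trend[n] + residual[n])))
);
console.log(maxError < 1e-9); // true
```

Changing the two window lengths changes which variations end up in which component, which is a useful way to build intuition for the role of \(W\).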
3.2.3 Noise in Sensor Measurements
Sensor measurements are never perfect. They are affected by various sources of noise and uncertainty that we must characterize and account for.
3.2.3.1 The Additive Noise Model
We model the relationship between a measurement, the true physical quantity, and noise as:
\[y_i = q_i + n_i\]
where:
\(y_i\) is the measured value at discrete time index \(i\)
\(q_i\) is the true physical quantity we are trying to measure
\(n_i\) is the measurement noise
The noise \(n_i\) represents all sources of error: sensor imprecision, environmental interference, quantization errors, etc.
3.2.3.2 Statistical Characterization of Noise
Since noise is random, we characterize it using probability distributions. In general, the noise follows some probability density function (PDF):
\[f_N(n)\]
which describes the likelihood of observing different noise values.
3.2.3.2.1 Gaussian (Normal) Noise
A common assumption is that measurement noise follows a Gaussian distribution:
\[f_N(n) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(n - \mu)^2}{2\sigma^2}\right)\]
where \(\mu\) is the mean and \(\sigma^2\) is the variance of the noise. For many sensors, we assume zero-mean noise (\(\mu = 0\)), meaning the noise doesn’t systematically bias measurements in one direction.
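To make the additive noise model concrete, the sketch below simulates measurements \(y_i = q_i + n_i\) with zero-mean Gaussian noise and checks that the sample statistics match the assumed \(\mu\) and \(\sigma\). The true value `q`, the noise level, the seed, and the helper names `mulberry32` and `makeGaussian` are all illustrative assumptions; the seeded generator is used only so the run is reproducible.

```javascript
// Small seeded pseudo-random generator returning uniform values in [0, 1).
// Seeding makes the example reproducible; Math.random() would also work.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Box-Muller transform: turns two uniform draws into one Gaussian draw.
function makeGaussian(uniform) {
  return function (mu, sigma) {
    const u = 1 - uniform(); // in (0, 1], so Math.log(u) is finite
    const v = uniform();
    return mu + sigma * Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
  };
}

const gaussian = makeGaussian(mulberry32(42));

const q = 21.0;    // true temperature in °C (held constant here)
const sigma = 0.5; // noise standard deviation in °C
const N = 10000;   // number of simulated measurements

// Additive noise model: y_i = q_i + n_i with zero-mean Gaussian n_i.
const y = Array.from({ length: N }, () => q + gaussian(0, sigma));

const mean = y.reduce((s, v) => s + v, 0) / N;
const variance = y.reduce((s, v) => s + (v - mean) ** 2, 0) / (N - 1);

console.log(mean.toFixed(2));                // close to 21
console.log(Math.sqrt(variance).toFixed(2)); // close to 0.5
```

Because the noise is zero-mean, the sample mean of the measurements lands near the true value `q` rather than being shifted in one direction.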
3.2.3.2.2 Variance of the Mean
When we have multiple independent measurements \(y_1, y_2, \ldots, y_N\) of the same quantity, we often compute their average (called the sample mean):
\[\bar{y} = \frac{1}{N}\sum_{i=1}^{N} y_i\]
If the noise is independent and identically distributed (i.i.d.) with variance \(\sigma^2\), the variance of the mean is:
\[\text{Var}(\bar{y}) = \frac{\sigma^2}{N}\]
Tip
Active Learning
You may want to prove that the variance of the sample mean is indeed \(\frac{\sigma^2}{N}\). There is a sketch of the proof in the lecture slides.
This shows that averaging \(N\) measurements reduces the variance (and thus uncertainty) by a factor of \(N\). The standard deviation decreases by a factor of \(\sqrt{N}\).
Example: Suppose a temperature sensor has noise with standard deviation \(\sigma = 0.5°C\). If we take a single measurement, the uncertainty is \(0.5°C\). If we average 100 measurements, the uncertainty of the average becomes:
\[\sigma_{\bar{y}} = \frac{\sigma}{\sqrt{N}} = \frac{0.5}{\sqrt{100}} = 0.05°C\]
This tenfold reduction in uncertainty illustrates the power of averaging to improve measurement quality.
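The \(\sigma/\sqrt{N}\) rule can also be checked empirically. The sketch below repeats the "average 100 measurements" experiment many times and compares the spread of the resulting averages with the prediction. The number of trials, the seed, and the helper names are assumptions chosen for illustration.

```javascript
// Small seeded pseudo-random generator returning uniform values in [0, 1).
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Box-Muller transform producing zero-mean Gaussian noise with std sigma.
function makeNoise(uniform, sigma) {
  return function () {
    const u = 1 - uniform(); // in (0, 1], so Math.log(u) is finite
    const v = uniform();
    return sigma * Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
  };
}

// Sample standard deviation (divides by length - 1).
function sampleStd(values) {
  const m = values.reduce((s, v) => s + v, 0) / values.length;
  const ss = values.reduce((s, v) => s + (v - m) ** 2, 0);
  return Math.sqrt(ss / (values.length - 1));
}

const sigma = 0.5;   // single-measurement noise std in °C
const N = 100;       // measurements per average
const trials = 2000; // number of independent averages to simulate
const noise = makeNoise(mulberry32(7), sigma);

// Each trial averages N noisy readings (the true value is taken as 0 here).
const averages = Array.from({ length: trials }, () => {
  let sum = 0;
  for (let i = 0; i < N; i++) sum += noise();
  return sum / N;
});

const predicted = sigma / Math.sqrt(N); // 0.05
const observed = sampleStd(averages);

console.log(predicted.toFixed(3)); // 0.050
console.log(observed.toFixed(3));  // close to 0.050
```

The observed value is itself an estimate, so it will not match the prediction exactly, but it should approach it as the number of trials grows.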
3.2.4 Virtual Sensors and Uncertainty Propagation
In many engineering applications, we don’t directly measure the quantity of interest. Instead, we derive it from other measurements using mathematical relationships.
3.2.4.1 What is a Virtual Sensor?
A virtual sensor is a measurement derived from one or more physical sensor measurements through a mathematical function. For example:
Average temperature across multiple rooms: \(T_{\text{avg}} = \frac{1}{3}(T_1 + T_2 + T_3)\)
Total energy consumption: \(E_{\text{total}} = E_1 + E_2 + E_3\)
Heat flux from temperature difference: \(q = k \cdot \Delta T\)
These derived quantities are “virtual” measurements that carry uncertainty from the original measurements.
3.2.4.2 Uncertainty Propagation for Linear Combinations
Consider a virtual measurement that is a linear combination of \(M\) sensor measurements:
\[z = \sum_{j=1}^{M} a_j y_j\]
where \(a_j\) are constants and \(y_j\) are the individual measurements. Each measurement has the form \(y_j = q_j + n_j\).
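If we assume the noise terms \(n_j\) are independent and identically distributed with zero mean and variance \(\sigma^2\), the variance of the virtual measurement is:
\[\text{Var}(z) = \sum_{j=1}^{M} a_j^2 \, \text{Var}(n_j) = \sigma^2 \sum_{j=1}^{M} a_j^2\]
For the special case of averaging, \(a_j = \frac{1}{M}\) for every \(j\), which gives:
\[\text{Var}(z) = \sigma^2 \cdot M \cdot \frac{1}{M^2} = \frac{\sigma^2}{M}\]
This confirms our earlier result: averaging reduces variance by a factor of \(M\).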
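For independent noise terms, the variance of a linear combination is the sum of the individual noise variances, each weighted by its squared coefficient. The sketch below applies that rule; the function name `linearCombinationVariance` is introduced here for illustration, and allowing a different standard deviation per sensor is a small generalization beyond the identically distributed case assumed above.

```javascript
// Variance of z = sum_j a[j] * y[j] when the noise terms are independent
// with zero mean and standard deviations sigma[j].
function linearCombinationVariance(a, sigma) {
  if (a.length !== sigma.length) {
    throw new Error("a and sigma must have the same length");
  }
  return a.reduce((sum, aj, j) => sum + aj * aj * sigma[j] * sigma[j], 0);
}

// Averaging three sensors: a_j = 1/3 for every sensor.
const weights = [1 / 3, 1 / 3, 1 / 3];

// Equal noise (0.5 °C each): the std of the average is sigma / sqrt(3).
const equalStd = Math.sqrt(linearCombinationVariance(weights, [0.5, 0.5, 0.5]));

// Unequal noise: the noisiest sensor dominates the uncertainty of the average.
const unequalStd = Math.sqrt(linearCombinationVariance(weights, [0.1, 0.5, 2.0]));

// Summing instead of averaging (a_j = 1): the variances simply add up.
const sumVariance = linearCombinationVariance([1, 1, 1], [0.5, 0.5, 0.5]);

console.log(equalStd.toFixed(3));    // 0.289
console.log(unequalStd.toFixed(3));  // 0.688
console.log(sumVariance.toFixed(2)); // 0.75
```

The unequal-noise case shows why a plain average is not always the best virtual sensor: one poor sensor can make the average less certain than the best individual sensor.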
Understanding how uncertainty propagates through virtual measurements is critical for:
Assessing the reliability of derived quantities
Designing sensor networks with appropriate redundancy
Making informed decisions based on uncertain data
3.3 Learn-by-Doing Activities
Note
The following activity needs to be completed in an interactive version of this document (i.e., the HTML output served by a web server).
3.4 Interactive Exploration: Virtual Sensors and Uncertainty
This interactive widget allows you to explore how uncertainty in individual sensor measurements affects a virtual measurement (the average of multiple sensors).
The scenario: Three temperature sensors measure the temperature in three different rooms of a building. Measurements are taken every 5 minutes for one full day (288 samples). Each sensor measures the true room temperature plus some random noise. We create a virtual sensor that computes the average temperature across all three rooms.
Use the sliders below to adjust the true temperature (mean) and measurement noise (standard deviation) for each room, and observe how these changes affect both the individual measurements and the average temperature.
Plot = import("https://cdn.jsdelivr.net/npm/@observablehq/plot@0.6.0/+esm")
d3 = import("https://cdn.jsdelivr.net/npm/d3@7/+esm")

// Number of samples (1 day at 5-minute intervals = 288 samples)
samplesPerDay = 288

// Time array (in hours)
timeHours = Array.from({length: samplesPerDay}, (_, i) => i * 5 / 60)

// Generate Gaussian random numbers using Box-Muller transform
function randn() {
  let u = 0, v = 0;
  while (u === 0) u = Math.random();
  while (v === 0) v = Math.random();
  return Math.sqrt(-2.0 * Math.log(u)) * Math.cos(2.0 * Math.PI * v);
}

// Function to generate temperature time series for a room
function generateRoomTemperature(mean, stddev, n) {
  return Array.from({length: n}, () => mean + stddev * randn());
}

// Sliders for Room 1
viewof room1Mean = Inputs.range([15, 25], {value: 20, step: 0.5, label: "Room 1 Mean Temperature (°C)"})
viewof room1Std = Inputs.range([0, 2], {value: 0.5, step: 0.1, label: "Room 1 Noise Std Dev (°C)"})

// Sliders for Room 2
viewof room2Mean = Inputs.range([15, 25], {value: 21, step: 0.5, label: "Room 2 Mean Temperature (°C)"})
viewof room2Std = Inputs.range([0, 2], {value: 0.5, step: 0.1, label: "Room 2 Noise Std Dev (°C)"})

// Sliders for Room 3
viewof room3Mean = Inputs.range([15, 25], {value: 19, step: 0.5, label: "Room 3 Mean Temperature (°C)"})
viewof room3Std = Inputs.range([0, 2], {value: 0.5, step: 0.1, label: "Room 3 Noise Std Dev (°C)"})

// Button to regenerate noise
viewof regenerate = Inputs.button("Regenerate Noise", {value: 0, reduce: (v) => v + 1})

// Generate temperature data for each room
room1Temps = {
  regenerate; // This makes the cell reactive to button clicks
  return generateRoomTemperature(room1Mean, room1Std, samplesPerDay);
}
room2Temps = {
  regenerate;
  return generateRoomTemperature(room2Mean, room2Std, samplesPerDay);
}
room3Temps = {
  regenerate;
  return generateRoomTemperature(room3Mean, room3Std, samplesPerDay);
}

// Compute average (virtual sensor)
averageTemps = room1Temps.map((v1, i) => (v1 + room2Temps[i] + room3Temps[i]) / 3)

// Prepare data for plotting
plotData = timeHours.flatMap((t, i) => [
  {time: t, temperature: room1Temps[i], sensor: "Room 1"},
  {time: t, temperature: room2Temps[i], sensor: "Room 2"},
  {time: t, temperature: room3Temps[i], sensor: "Room 3"},
  {time: t, temperature: averageTemps[i], sensor: "Average (Virtual)"}
])

// Create the plot
Plot.plot({
  width: 900,
  height: 400,
  marginLeft: 60,
  marginRight: 100,
  x: {label: "Time (hours)", grid: true},
  y: {label: "Temperature (°C)", grid: true},
  color: {
    domain: ["Room 1", "Room 2", "Room 3", "Average (Virtual)"],
    range: ["steelblue", "orange", "green", "red"],
    legend: true
  },
  marks: [
    Plot.line(plotData, {
      x: "time",
      y: "temperature",
      stroke: "sensor",
      strokeWidth: d => d.sensor === "Average (Virtual)" ? 2.5 : 1.5,
      opacity: d => d.sensor === "Average (Virtual)" ? 1 : 0.7
    })
  ]
})

md`### Observed Statistics

**Individual Sensors:**
- Room 1: Mean = ${d3.mean(room1Temps).toFixed(2)}°C, Std Dev = ${d3.deviation(room1Temps).toFixed(2)}°C
- Room 2: Mean = ${d3.mean(room2Temps).toFixed(2)}°C, Std Dev = ${d3.deviation(room2Temps).toFixed(2)}°C
- Room 3: Mean = ${d3.mean(room3Temps).toFixed(2)}°C, Std Dev = ${d3.deviation(room3Temps).toFixed(2)}°C

**Virtual Sensor (Average):**
- Mean = ${d3.mean(averageTemps).toFixed(2)}°C
- Std Dev = ${d3.deviation(averageTemps).toFixed(2)}°C

**Theoretical Prediction:**
- If all rooms had the same noise level σ, the average should have std dev ≈ σ/√3 = ${(Math.max(room1Std, room2Std, room3Std) / Math.sqrt(3)).toFixed(2)}°C
- Actual average std dev: ${d3.deviation(averageTemps).toFixed(2)}°C`
3.4.1 Questions to Consider
Effect of noise: Try setting all three rooms to the same mean temperature but different noise levels. How does the averaging reduce the noise in the virtual sensor?
Variance reduction: Set all three rooms to have the same noise standard deviation (e.g., 1.0°C). Compare the standard deviation of the individual rooms to the standard deviation of the average. Is it close to the theoretical prediction of σ/√3?
Different means: Set the three rooms to have different mean temperatures. What is the mean of the average temperature? How does it relate to the individual room temperatures?
High vs. low noise: Compare the case where all rooms have low noise (std dev = 0.1°C) versus high noise (std dev = 2.0°C). Click “Regenerate Noise” several times for each case. How much does the average temperature fluctuate between regenerations?