<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Slides | Steefan Contractor</title><link>https://steefancontractor.github.io/slides/</link><atom:link href="https://steefancontractor.github.io/slides/index.xml" rel="self" type="application/rss+xml"/><description>Slides</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Thu, 11 Jul 2024 11:30:37 +1000</lastBuildDate><item><title>Aceas Multiage Seaice Classification</title><link>https://steefancontractor.github.io/slides/aceas-multiage-seaice-classification/</link><pubDate>Thu, 11 Jul 2024 11:30:37 +1000</pubDate><guid>https://steefancontractor.github.io/slides/aceas-multiage-seaice-classification/</guid><description>&lt;link rel="stylesheet" href="reveal_custom.css">
&lt;link rel="stylesheet" href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css">
&lt;div class="vert-banner">
&lt;img src="./img/UNSW_logo-portrait-light_transparent.png">
&lt;img src="https://antarctic.org.au/wp-content/uploads/2021/10/ACEAS-Logo-Concept-1-WHT.png" style="height:12vh">
&lt;img src="./img/Udash logo CMYK revised-03.png">
&lt;/div>
&lt;script>
function add_vert_banner() {
let vertbanner = document.querySelector("div.vert-banner");
let reveal = document.querySelector(".reveal");
reveal.insertBefore(vertbanner, reveal.firstChild);
}
window.onload = add_vert_banner;
&lt;/script>
&lt;h1 id="bayesian-updates-to-multi-age-antarctic-sea-ice-concentrations-using-gnss-r-data">Bayesian updates to multi-age Antarctic sea-ice concentrations using GNSS-R data&lt;/h1>
&lt;p>Speaker: Steefan Contractor&lt;/p>
&lt;p>Coauthors: Shane Keating (UNSW), Jessica Cartwright (Spire Global), Alex Fraser (UTAS)&lt;/p>
&lt;aside class="notes">
&lt;p>Key points:&lt;/p>
&lt;ul>
&lt;li>Coauthors in no particular order&lt;/li>
&lt;li>why updating as opposed to straight up prediction?&lt;/li>
&lt;/ul>
&lt;/aside>
&lt;hr>
&lt;section data-background-image="https://images.unsplash.com/photo-1549598685-0058b114c9d6?q=80&amp;w=3078&amp;auto=format&amp;fit=crop&amp;ixlib=rb-4.0.3&amp;ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D" data-background-opacity=0.3>
&lt;h2 id="sea-ice">Sea Ice&lt;/h2>
&lt;ul>
&lt;li>Floating ice that forms from the freezing of seawater&lt;/li>
&lt;li>Melting and formation of sea ice affects the ocean salinity and heat content&lt;/li>
&lt;li>The changes in ocean density and temperature affect the ocean circulation as evidenced by recent coverage on AMOC weakening&lt;/li>
&lt;li>It affects the Earth&amp;rsquo;s energy balance by reflecting ten times more sunlight than open water&lt;/li>
&lt;li>It acts like a blanket affecting not just heat exchange between the ocean and atmosphere but also gases&lt;/li>
&lt;li>Through the changes in polar air masses it also affects atmospheric circulation&lt;/li>
&lt;/ul>
&lt;aside class="notes">
&lt;ul>
&lt;li>don&amp;rsquo;t spend too much time on context&lt;/li>
&lt;li>audience knows importance of sea ice&lt;/li>
&lt;/ul>
&lt;/aside>
&lt;/section>
&lt;hr>
&lt;section data-background-image="./img/sea-ice-types.png" data-background-opacity=1.0 data-background-size=75%>
&lt;aside class="notes">
&lt;ul>
&lt;li>key way we distinguish sea ice types is by age/thickness&lt;/li>
&lt;li>very new ice has structural differences
&lt;ul>
&lt;li>such as sharp needles in Frazil ice etc.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>but in this study we focus on YI, FYI and MYI&lt;/li>
&lt;li>which differ in their salinity, thickness and roughness&lt;/li>
&lt;/ul>
&lt;/aside>
&lt;hr>
&lt;p>Image sources:&lt;/p>
&lt;ul>
&lt;li>&lt;span style="font-size: small;">&lt;a href="https://www.researchgate.net/publication/353599050_Novel_applications_of_GNSS-R_data_from_TechDemoSat-1_to_monitoring_the_cryosphere" target="_blank" rel="noopener">Novel applications of GNSS-R data from TechDemoSat-1 to monitoring the cryosphere, Jessica Cartwright&lt;/a>&lt;/span>&lt;/li>
&lt;li>&lt;span style="font-size: small;">&lt;a href="https://www.ccin.ca/ccw/seaice/overview/types" target="_blank" rel="noopener">https://www.ccin.ca/ccw/seaice/overview/types&lt;/a>&lt;/span>&lt;/li>
&lt;li>&lt;span style="font-size: small;">Australian Antarctic Program&lt;/span>
&lt;ul>
&lt;li>&lt;span style="font-size: small;">&lt;a href="https://www.antarctica.gov.au/site/assets/files/47686/rs13801_rich-youd-2014-04-07-2x1a8995-lpr.1024x0.jpg" target="_blank" rel="noopener">https://www.antarctica.gov.au/site/assets/files/47686/rs13801_rich-youd-2014-04-07-2x1a8995-lpr.1024x0.jpg&lt;/a>&lt;/span>&lt;/li>
&lt;li>&lt;span style="font-size: small;">&lt;a href="https://www.antarctica.gov.au/site/assets/files/47686/mawson-grease-ice-2016-jennifer-wressell.1200x0.jpg" target="_blank" rel="noopener">https://www.antarctica.gov.au/site/assets/files/47686/mawson-grease-ice-2016-jennifer-wressell.1200x0.jpg&lt;/a>&lt;/span>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;span style="font-size: small;">&lt;a href="https://www.sciencefriday.com/wp-content/uploads/2017/02/fce480147c3334975973a38782f1382c-min.jpg" target="_blank" rel="noopener">https://www.sciencefriday.com/wp-content/uploads/2017/02/fce480147c3334975973a38782f1382c-min.jpg&lt;/a>&lt;/span>&lt;/li>
&lt;li>&lt;span style="font-size: small;">&lt;a href="https://www.youtube.com/watch?v=_iTBQiE2CuM" target="_blank" rel="noopener">https://www.youtube.com/watch?v=_iTBQiE2CuM&lt;/a>&lt;/span>&lt;/li>
&lt;li>&lt;span style="font-size: small;">&lt;a href="https://johnenglander.net/wp/wp-content/uploads/2018/05/young-sea-ice.jpeg" target="_blank" rel="noopener">https://johnenglander.net/wp/wp-content/uploads/2018/05/young-sea-ice.jpeg&lt;/a>&lt;/span>&lt;/li>
&lt;li>&lt;span style="font-size: small;">&lt;a href="https://library.wmo.int/records/item/41953-wmo-sea-ice-nomenclature" target="_blank" rel="noopener">WMO Sea Ice Nomenclature WMO-No. 259&lt;/a>&lt;/li>
&lt;/ul>
&lt;aside class="notes">
&lt;ul>
&lt;li>references can be found in a vertical slide&lt;/li>
&lt;/ul>
&lt;/aside>
&lt;/section>
&lt;hr>
&lt;span class="fragment " >
&lt;p>&lt;strong>Young Ice (YI)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Newly formed ice&lt;/li>
&lt;li>can be rough or smooth&lt;/li>
&lt;li>less than 30cm thick&lt;/li>
&lt;/ul>
&lt;/span>
&lt;span class="fragment " >
&lt;p>&lt;strong>First-year Ice (FYI)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Ice of not more than one winter&amp;rsquo;s growth, which has not yet survived a summer melt season&lt;/li>
&lt;li>can be level, rough or have ridges&lt;/li>
&lt;li>30cm to 2m thick&lt;/li>
&lt;/ul>
&lt;/span>
&lt;span class="fragment " >
&lt;p>&lt;strong>Multi-year Ice (MYI)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Ice that has survived more than one summer melt season&lt;/li>
&lt;li>typically smoother than FYI&lt;/li>
&lt;li>Over 2.5m thick and hence protrudes above the waterline&lt;/li>
&lt;li>has extremely low salinity compared to YI and FYI&lt;/li>
&lt;/ul>
&lt;/span>
&lt;aside class="notes">
&lt;p>&lt;strong>Add a slide after this showing what these 3 types of ice look like in the dataset we will be using&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>thickness of YI, FYI and MYI&lt;/li>
&lt;li>freezing ice rejects salt, but some of it gets trapped in the ice and slowly drains out over time&lt;/li>
&lt;li>this is why older ice has lower salinity&lt;/li>
&lt;/ul>
&lt;/aside>
&lt;hr>
&lt;section data-background-image="./img/AdobeStock_587380910.jpeg" data-background-opacity=0.3>
&lt;h2 id="remote-sensing-of-sea-ice">Remote Sensing of Sea Ice&lt;/h2>
&lt;span class="fragment " >
&lt;p>Active sensors&lt;/p>
&lt;ul>
&lt;li>Radar altimeters (CryoSat-2)&lt;/li>
&lt;li>Laser altimeters (ICESat-2)&lt;/li>
&lt;li>Scatterometers (Metop A/B/C - ASCAT)&lt;/li>
&lt;li>Synthetic Aperture Radar (Radarsat-2)&lt;/li>
&lt;/ul>
&lt;/span>
&lt;span class="fragment " >
&lt;p>Passive sensors&lt;/p>
&lt;ul>
&lt;li>Passive microwave radiometers (SMOS)&lt;/li>
&lt;/ul>
&lt;/span>
&lt;span class="fragment " >
&lt;p>Hybrid sensors&lt;/p>
&lt;ul>
&lt;li>Global Navigation Satellite System Reflectometry - GNSS-R&lt;/li>
&lt;/ul>
&lt;/span>
&lt;aside class="notes">
&lt;ul>
&lt;li>Active is when we create our own signal&lt;/li>
&lt;li>requires a lot of power and hence cost of construction and operation is high&lt;/li>
&lt;li>Passive in contrast picks up Earth&amp;rsquo;s natural emissions&lt;/li>
&lt;li>Different frequencies of light pick up different properties of the surface emitting or reflecting the radiation&lt;/li>
&lt;li>I&amp;rsquo;ve listed key active and passive missions that are used to monitor sea ice over Antarctica&lt;/li>
&lt;li>GNSS-R is a hybrid sensor that uses the reflected signals from GNSS (GPS) satellites, which are constantly bombarding the Earth&amp;rsquo;s surface,&lt;/li>
&lt;li>typically using cubesats that are low cost and cheap to run&lt;/li>
&lt;li>so it has the advantages of both active and passive sensors&lt;/li>
&lt;li>and more importantly for us, it is a completely independent data source&lt;/li>
&lt;/ul>
&lt;/aside>
&lt;hr>
&lt;p>IUP U. Bremen Multiage Ice Concentration
&lt;img src="./img/U Brem. IUP multiage ice 2020-03-01.png" height=280px>&lt;/p>
&lt;p>GNSS-R&lt;/p>
&lt;img src="./img/gnssr excess phase noise animation.gif" width=300px>
&lt;img src="./img/gnssr phase noise animation.gif" width=300px>
&lt;/section>
&lt;hr>
&lt;h2 id="gnss-r">GNSS-R&lt;/h2>
&lt;section data-background-image="./img/AdobeStock_429767427.jpeg" data-background-opacity=0.5 data-background-size="cover">
&lt;div style="margin-top: 80px;">
&lt;img src="./img/GNSS-R - signal to features schematic.png" style="max-width: 100%;">
&lt;/div>
&lt;aside class="notes">
&lt;ul>
&lt;li>Ok here&amp;rsquo;s how it works: there is a direct and reflected signal&lt;/li>
&lt;li>we aggregate the intensity of each signal into time delay (phase space) and Doppler shift (frequency space) bins, giving us these delay-Doppler maps (DDMs)&lt;/li>
&lt;li>from these DDMs we get the various variables that we will use as features to classify the ice types
&lt;ul>
&lt;li>power: integrating over the DDM space we get the power&lt;/li>
&lt;li>reflectivity: ratio of the reflected to the incident power&lt;/li>
&lt;li>snr: subtract out the noise floor where we don&amp;rsquo;t see a signal and divide it by the noise floor&lt;/li>
&lt;li>phase noise: standard deviation of the phase of the signal&lt;/li>
&lt;li>excess phase noise: std dev of the phase after removing the coherent SNR component of the signal, leaving only the component of phase noise due to the surface roughness&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>These variables were hand picked by Spire&amp;rsquo;s domain experts (Jessica Cartwright) after lots of internal testing and this was the data that was given to us&lt;/li>
&lt;li>Note that the reason we focus on these variables instead of using the entire DDMs is because the dataset is already over 10Gb with only 14 numbers per observation, let alone a whole image per observation.&lt;/li>
&lt;li>There are a lot of samples, the raw data was in TB.&lt;/li>
&lt;/ul>
&lt;/aside>
&lt;/section>
&lt;hr>
&lt;div class="multi-column">
&lt;div class="col1">
&lt;h2 id="iup-multiyear-ice-concentration-and-other-sea-ice-types-version-aq2-antarctic">IUP Multiyear ice concentration and other sea ice types, Version AQ2 (Antarctic)&lt;/h2>
&lt;span class="fragment " >
&lt;ul>
&lt;li>Provides YI, FYI and MYI concentrations&lt;/li>
&lt;li>Developed by Institute of Environmental Physics, University of Bremen&lt;/li>
&lt;li>Uses passive microwave (AMSR2) and scatterometer (ASCAT instruments on Metop A/B/C) data to derive initial estimates&lt;/li>
&lt;li>Corrects the initial estimates using 2m surface air temperature and sea ice drift data&lt;/li>
&lt;li>12.5km x 12.5km grid resolution&lt;/li>
&lt;/ul>
&lt;p class="citation"> Melsheimer, Christian; Spreen, Gunnar; Ye, Yufang; Shokr, Mohammed (2019): Multiyear Ice Concentration, Antarctic, 12.5 km grid, cold seasons 2013-2018 (from satellite). PANGAEA, https://doi.org/10.1594/PANGAEA.909054
&lt;/span>
&lt;/div>
&lt;div class="col1">
&lt;span class="fragment " >
&lt;img src="./img/Original U.Brem non zero ice concentration distribution.png">
&lt;/span>
&lt;/div>
&lt;/div>
&lt;hr>
&lt;div class="multi-column">
&lt;div class="col1" style="margin-top: 100px;">
&lt;h2 id="sea-ice-signal-in-gnss-r-features">Sea Ice Signal in GNSS-R features?&lt;/h2>
&lt;/div>
&lt;div class="col2">
&lt;img src="./img/Histogram of original features min-max-scaled.png">
&lt;/div>
&lt;/div>
&lt;hr>
&lt;section>
&lt;iframe src="https://steefancontractor.github.io/plotly_plots/spacebridge/First%20three%20principal%20components%20of%20features%20coloured%20by%20water-ice%20label.html" width="100%" height="600px" style="border:none;">&lt;/iframe>
&lt;hr>
&lt;p>&lt;strong>Some details&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>PCA data:
&lt;ul>
&lt;li>water - total ice concentration = 0%&lt;/li>
&lt;li>ice - total ice concentration &amp;gt; 99%&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>PCA features:
&lt;ul>
&lt;li>reflectivity1&lt;/li>
&lt;li>snr_reflected1&lt;/li>
&lt;li>power_reflected1&lt;/li>
&lt;li>phase_noise1&lt;/li>
&lt;li>excess_phase_noise1&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>total explained variance: 99.25%&lt;/li>
&lt;/ul>
&lt;/section>
&lt;hr>
&lt;section>
&lt;p>&lt;iframe src="https://steefancontractor.github.io/plotly_plots/spacebridge/First%20three%20principal%20components%20of%20features%20coloured%20by%20ice%20type%20labels.html" width="100%" height="600px" style="border:none;">&lt;/iframe>&lt;/p>
&lt;hr>
&lt;p>&lt;strong>Some details&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>PCA data:
&lt;ul>
&lt;li>YI - YI ice concentration &amp;gt; 90%&lt;/li>
&lt;li>FYI - FYI ice concentration &amp;gt; 99.9%&lt;/li>
&lt;li>MYI - MYI ice concentration &amp;gt; 99%&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>PCA features:
&lt;ul>
&lt;li>reflectivity1&lt;/li>
&lt;li>snr_reflected1&lt;/li>
&lt;li>power_reflected1&lt;/li>
&lt;li>phase_noise1&lt;/li>
&lt;li>excess_phase_noise1&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>total explained variance: 99.58%&lt;/li>
&lt;/ul>
&lt;/section>
&lt;hr>
&lt;div class="multi-column">
&lt;div class="col1" style="margin-top: 100px">
&lt;h2 id="correlation-amongst-gnss-r-features-and-ice-types">Correlation amongst GNSS-R features and ice types&lt;/h2>
&lt;/div>
&lt;div class="col2">
&lt;img src="./img/Correlation heatmap of engineered features and labels.png" height="640px">
&lt;/div>
&lt;/div>
&lt;hr>
&lt;section>
&lt;h2 id="geographic-distribution">Geographic distribution&lt;/h2>
&lt;img src="./img/Geographic distribution of ALL GNSS-R observations.png">
&lt;hr>
&lt;img src="./img/Geographic distribution of only GNSS-R observations with ice conc >= 80%25.png" height="320px">
&lt;div>
&lt;img src="./img/S_202009_conc_hires_v3.0.png" height="360px">
&lt;img src="./img/S_202002_conc_hires_v3.0.png" height="360px">
&lt;/div>
&lt;hr>
&lt;img src="./img/Antarctic yearly sea ice extent.png" height="500px">
&lt;p class="citation"> National Snow and Ice Data Center, Boulder, Colorado USA. https://nsidc.org/data/seaice_index, last access: 2024-07-16
&lt;p class="citation"> J. C. Comiso, A. C. Bliss, R. Gersten, C. L. Parkinson, and T. Markus (2024), Current State of Sea Ice Cover, https://earth.gsfc.nasa.gov/cryo/data/current-state-sea-ice-cover, last access: 2024-07-16.
&lt;/section>
&lt;hr>
&lt;h2 id="ice-concentrations-as-probabilities">Ice concentrations as probabilities&lt;/h2>
&lt;ul>
&lt;li>the spatial footprint of GNSS-R grid is around 2.5% of the size of the IUP grid&lt;/li>
&lt;li>interpret ice concentration as the probability of seeing that kind of ice inside that grid cell&lt;/li>
&lt;li>assume that the ice concentrations can be erroneous&lt;/li>
&lt;li>assume higher ice concentrations are more likely to be correct&lt;/li>
&lt;li>semi-supervised approach: learn a mapping from GNSS-R to ice labels where IUP concentrations are close to 100%&lt;/li>
&lt;li>update the IUP concentrations using the learned mapping&lt;/li>
&lt;/ul>
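&lt;p>A minimal sketch of the label-selection step, assuming each observation carries its IUP concentrations as fractions under the hypothetical keys &lt;code>YI&lt;/code>, &lt;code>FYI&lt;/code>, &lt;code>MYI&lt;/code> and &lt;code>water&lt;/code>. The thresholds are the high-confidence cut-offs quoted in this talk (YI > 90%, FYI > 99.9%, MYI > 99%, water = 100%); the function name and row layout are illustrative and not the code used in the study.&lt;/p>

```python
# Thresholds (as fractions) above which an IUP concentration is trusted as a label.
# The values follow the talk; the dictionary layout is an assumption for illustration.
THRESHOLDS = {"YI": 0.90, "FYI": 0.999, "MYI": 0.99}


def confident_label(conc):
    """Return a class label if one concentration is high enough, else None.

    `conc` maps "YI", "FYI", "MYI" and "water" to fractions in [0, 1].
    """
    if conc["water"] == 1.0:
        return "water"
    for ice_type, threshold in THRESHOLDS.items():
        if conc[ice_type] > threshold:
            return ice_type
    return None  # ambiguous cell: not used for training, only updated afterwards


rows = [
    {"YI": 0.95, "FYI": 0.05, "MYI": 0.0, "water": 0.0},
    {"YI": 0.0, "FYI": 0.0, "MYI": 0.0, "water": 1.0},
    {"YI": 0.3, "FYI": 0.5, "MYI": 0.2, "water": 0.0},
]
labels = [confident_label(r) for r in rows]
```

&lt;p>Only rows with a non-empty label would feed the classifier; the remaining rows keep their IUP concentrations as priors for the update step.&lt;/p>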
&lt;hr>
&lt;section>
&lt;h2 id="bayesian-update">Bayesian update&lt;/h2>
&lt;div style="font-size: 0.4em; margin-top: 100px">
$$
\begin{align*}
P(true\, label\, |\, model\, pred) &amp;= \frac{P(model\, pred\, |\, true\, label) \times P(true\, label)}{P(model\, pred)} \\
&amp;= \frac{P(model\, pred\, |\, true\, label) \times P(true\, label)}{\sum_{true\, labels}{P(model\, pred\, ,\, true\, label)}} \\
&amp;= \frac{P(model\, pred\, |\, true\, label) \times P(true\, label)}{\sum_{true\, labels}{P(model\, pred\, |\, true\, label) \times P(true\, label)}}
\end{align*}
$$
&lt;/div>
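&lt;p>The update above can be sketched for a single observation as follows. This assumes the prior is the set of IUP concentrations for the grid cell and the likelihood is a row-normalised confusion matrix; the function name and all numbers in the example are made up for illustration.&lt;/p>

```python
CLASSES = ["YI", "FYI", "MYI", "water"]


def bayes_update(prior, likelihood, predicted):
    """Posterior P(true label | model pred) for one observation.

    prior:      dict, P(true label), e.g. the IUP concentrations of the cell
    likelihood: dict of dicts, likelihood[true][pred] = P(model pred | true label)
    predicted:  the label predicted by the model for this observation
    """
    joint = {c: likelihood[c][predicted] * prior[c] for c in CLASSES}
    evidence = sum(joint.values())  # P(model pred), the denominator
    if evidence == 0:
        return dict(prior)  # prediction impossible under the prior: keep the prior
    return {c: joint[c] / evidence for c in CLASSES}


# Illustrative likelihood (rows: true label, columns: predicted label).
likelihood = {
    "YI": {"YI": 0.70, "FYI": 0.20, "MYI": 0.05, "water": 0.05},
    "FYI": {"YI": 0.10, "FYI": 0.80, "MYI": 0.05, "water": 0.05},
    "MYI": {"YI": 0.10, "FYI": 0.20, "MYI": 0.65, "water": 0.05},
    "water": {"YI": 0.02, "FYI": 0.02, "MYI": 0.01, "water": 0.95},
}
prior = {"YI": 0.2, "FYI": 0.5, "MYI": 0.3, "water": 0.0}
posterior = bayes_update(prior, likelihood, "MYI")
```

&lt;p>In this toy case an MYI prediction raises the MYI probability from 0.3 to roughly 0.85, while a class with zero prior stays at zero.&lt;/p>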
&lt;hr>
&lt;p>Two ways to get $P(model\ pred\ |\ true\ label)$:&lt;/p>
&lt;span class="fragment " >
&lt;ul>
&lt;li>Using a tabular ML model - model calibration required
&lt;ul>
&lt;li>calibrate using test dataset and Bayes rule again&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/span>
&lt;div style="font-size:0.4em">
&lt;span class="fragment " >
$$
P(pred\ |\ true) = \frac{P(true\ |\ pred)P(pred)}{\sum_{pred\ labels}{P(true\ |\ pred)P(pred)}}
$$
&lt;/span>
&lt;/div>
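&lt;p>A small sketch of this second application of Bayes&amp;rsquo; rule, assuming the calibrated quantities $P(true\ |\ pred)$ and $P(pred)$ have already been estimated on a held-out test set. The nested-dictionary layout, the function name and the two-class numbers are hypothetical.&lt;/p>

```python
def likelihood_from_calibration(p_true_given_pred, p_pred):
    """Invert P(true | pred) into P(pred | true) with Bayes' rule.

    p_true_given_pred[pred][true] and p_pred[pred] are assumed to come from a
    held-out test set. Returns a nested dict out[true][pred] = P(pred | true).
    Assumes every true class has non-zero total probability.
    """
    preds = list(p_pred)
    trues = list(next(iter(p_true_given_pred.values())))
    out = {}
    for t in trues:
        joint = {p: p_true_given_pred[p][t] * p_pred[p] for p in preds}
        total = sum(joint.values())  # equals P(true = t)
        out[t] = {p: joint[p] / total for p in preds}
    return out


# Toy two-class example with made-up numbers.
p_true_given_pred = {
    "ice": {"ice": 0.9, "water": 0.1},
    "water": {"ice": 0.2, "water": 0.8},
}
p_pred = {"ice": 0.5, "water": 0.5}
p_pred_given_true = likelihood_from_calibration(p_true_given_pred, p_pred)
```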
&lt;span class="fragment " >
&lt;ul>
&lt;li>Using Robust Mixture Discriminant Analysis (RMDA)&lt;/li>
&lt;/ul>
&lt;p class="citation"> Bouveyron, C., &amp; Girard, S. (2009). Robust supervised classification with mixture models: Learning from data with uncertain labels. Pattern Recognition, 42(11), 2649–2658. https://doi.org/10.1016/j.patcog.2009.03.027
&lt;/span>
&lt;/section>
&lt;hr>
&lt;div class="multi-column">
&lt;div class="col1" style="margin-top:150px">
&lt;h2 id="which-ml-model">Which ML model?&lt;/h2>
&lt;/div>
&lt;div class="col4" >
&lt;img src="./img/skmodels accuracy comparison.png" width=300px>
&lt;img src="./img/skmodels f1 comparison.png" width=300px>
&lt;div>
&lt;img src="./img/skmodels matthews corr coef comparison.png" width=300px>
&lt;img src="./img/skmodels fit time comparison.png" width=300px>
&lt;/div>
&lt;/div>
&lt;/div>
&lt;hr>
&lt;section>
&lt;h2 id="decision-tree-ensemble-methods">Decision tree ensemble methods&lt;/h2>
&lt;span class="fragment " >
&lt;ul>
&lt;li>&lt;strong>Random Forest&lt;/strong>: Fits an ensemble of N trees on random subsets (known as bagging) of data and predicts with a majority vote.&lt;/li>
&lt;/ul>
&lt;/span>
&lt;span class="fragment " >
&lt;ul>
&lt;li>&lt;strong>Adaptive Boosting (AdaBoost)&lt;/strong>: Trains N trees sequentially, each tree correcting the errors of the previous tree by giving more weight to the data points that were misclassified. This is known as boosting.&lt;/li>
&lt;/ul>
&lt;/span>
&lt;span class="fragment " >
&lt;ul>
&lt;li>&lt;strong>Gradient Boosting&lt;/strong>: Also a boosting method in that it trains N trees sequentially; however, instead of reweighting the data points, it fits each new tree to the residuals (the negative gradient of the loss) of the ensemble built so far.&lt;/li>
&lt;/ul>
&lt;/span>
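&lt;p>To make the reweighting idea concrete, here is a minimal sketch of one round of the classic two-class AdaBoost weight update. It is an illustration of the boosting principle only, not the implementation compared in this talk, and the function name is hypothetical.&lt;/p>

```python
import math


def adaboost_reweight(weights, misclassified):
    """One AdaBoost round: up-weight the points the current tree got wrong.

    weights:       current sample weights (sum to 1)
    misclassified: list of bools, True where the current tree was wrong
    Returns (alpha, new_weights), where alpha is the tree's vote weight.
    """
    err = sum(w for w, bad in zip(weights, misclassified) if bad)
    # Assumes the weighted error is strictly between 0 and 1.
    alpha = 0.5 * math.log((1.0 - err) / err)
    scaled = [w * math.exp(alpha if bad else -alpha)
              for w, bad in zip(weights, misclassified)]
    total = sum(scaled)
    return alpha, [w / total for w in scaled]


# Four equally weighted points, the first one misclassified.
alpha, new_weights = adaboost_reweight([0.25] * 4, [True, False, False, False])
```

&lt;p>After the update the misclassified point carries half of the total weight, so the next tree is pushed to get it right.&lt;/p>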
&lt;hr>
&lt;h2 id="lightgbm">LightGBM&lt;/h2>
&lt;p>Key improvements over vanilla gradient boosting:&lt;/p>
&lt;span class="fragment " >
&lt;ul>
&lt;li>&lt;strong>Histogram-based splitting&lt;/strong>: LightGBM groups continuous feature values into discrete bins and searches for split points over the bin boundaries instead of over every data point. This reduces the cost of finding splits and speeds up training.&lt;/li>
&lt;/ul>
&lt;/span>
&lt;span class="fragment " >
&lt;ul>
&lt;li>&lt;strong>Leaf-wise growth&lt;/strong>: Instead of growing the tree level-wise, LightGBM grows the tree leaf-wise, always splitting the leaf that gives the largest reduction in loss. For the same number of leaves this usually reaches a lower loss, although it can overfit if the tree size is not limited.&lt;/li>
&lt;/ul>
&lt;/span>
&lt;span class="fragment " >
&lt;ul>
&lt;li>&lt;strong>Gradient-based One-Side Sampling&lt;/strong>: LightGBM keeps the data points with large gradients of the loss function and randomly subsamples those with small gradients. This speeds up training by focusing on the data points that are more informative.&lt;/li>
&lt;/ul>
&lt;/span>
&lt;span class="fragment " >
&lt;ul>
&lt;li>&lt;strong>Exclusive Feature Bundling&lt;/strong>: LightGBM bundles exclusive features together to reduce the number of features that need to be considered during training.&lt;/li>
&lt;/ul>
&lt;/span>
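&lt;p>As an illustration of Gradient-based One-Side Sampling, here is a short sketch of the sampling idea: keep the points with the largest gradients, subsample the rest, and up-weight the subsample so the gradient statistics stay roughly unbiased. The function and parameter names are hypothetical and this is not LightGBM&amp;rsquo;s internal code.&lt;/p>

```python
import random


def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return (indices, weights) selected by gradient-based one-side sampling.

    Keeps the top_rate fraction of points with the largest |gradient| and a
    random other_rate fraction (of the full dataset) from the remainder,
    up-weighting the latter by (1 - top_rate) / other_rate.
    """
    n = len(gradients)
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top = int(top_rate * n)
    n_other = int(other_rate * n)
    top, rest = order[:n_top], order[n_top:]
    sampled = random.Random(seed).sample(rest, n_other)
    scale = (1.0 - top_rate) / other_rate
    indices = top + sampled
    weights = [1.0] * n_top + [scale] * n_other
    return indices, weights


gradients = [0.05, -0.9, 0.1, 0.02, 0.7, -0.03, 0.01, 0.04, -0.06, 0.08]
indices, weights = goss_sample(gradients)
```

&lt;p>With ten points and the default rates, the two largest-gradient points are always kept and one of the remaining eight is drawn at random with weight 8.&lt;/p>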
&lt;/section>
&lt;hr>
&lt;section>
&lt;h2 id="gaussian-mixture-discriminant-analysis">Gaussian Mixture Discriminant Analysis&lt;/h2>
&lt;div style="font-size: 0.4em;">
\[
\begin{align*}
G &amp;\sim Discrete(K) \\
C &amp;\sim Discrete(L) \\
X &amp;\in \mathbb{R}^p \\
p(x|C=i) &amp;= \sum_{j=1}^{K} P(C=i,G=j)p(x|G=j) \\
p(x) &amp;= \sum_{i=1}^L\sum_{j=1}^{K} P(C=i,G=j)p(x|G=j) \\
&amp;= \sum_{i=1}^L\sum_{j=1}^{K} \pi_{ij}\phi(x;\mu_{ij},\Sigma_{ij})
\end{align*}
\]
Here $\pi_{ij}$ is the mixing coefficient such that $\sum_{j=1}^K \pi_{ij} = 1$, and $\phi(x;\mu_{ij},\Sigma_{ij})$ is the multivariate Gaussian density with mean $\mu_{ij}$ and covariance $\Sigma_{ij}$.
&lt;/div>
&lt;hr>
&lt;h2 id="robust-mixture-discriminant-analysis">Robust Mixture Discriminant Analysis&lt;/h2>
&lt;div style="font-size: 0.4em;">
\[
\begin{align*}
p(x) &amp;= \sum_{i=1}^L\sum_{j=1}^{K} P(C=i,G=j)p(x|G=j) \\
&amp;= \sum_{i=1}^L\sum_{j=1}^{K} P(C=i|G=j)P(G=j)p(x|G=j) \\
&amp;= \sum_{i=1}^L\sum_{j=1}^{K} r_{ij}\pi_j\phi(x;\mu_j,\Sigma_j)
\end{align*}
\]
Since $\pi_j$ does not depend on $C$, we can fit a Gaussian mixture model (unsupervised) to get $\pi_j$. To get the $L\times K$ matrix of parameters $R=(r_{ij})$, we maximise the log likelihood:
\[
l(R) = \sum_{i=1}^{L}\sum_{x\in \mathcal{C}_i} \log(R_i\Psi(x))
\]
where $R_i$ is the $i$-th row of $R$, $\Psi(x) = (P(G=1|X=x),P(G=2|X=x),...,P(G=K|X=x))^t$ and $\mathcal{C}_i=\{x_l\}$
such that $x_l$ belongs to class $C=i$,
w.r.t. $r_{ij}\in [0,1], \forall i\in \{1,...,L\}\text{ and }j\in\{1,...,K\}$,
subject to $\sum_{i=1}^{L} r_{ij}=1, \forall j=1,...,K$.
&lt;/div>
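&lt;p>The term $R_i\Psi(x)$ in the likelihood above is the class posterior $P(C=i|X=x)$, which suggests the following classification step once $R$ and the mixture have been fitted. This is a minimal sketch with made-up numbers; the function name is hypothetical and the component responsibilities are assumed to come from an already fitted Gaussian mixture model.&lt;/p>

```python
def class_posterior(R, psi):
    """P(C=i | x) = sum_j r_ij * P(G=j | x), i.e. the matrix-vector product R psi.

    R:   L x K nested list, R[i][j] = r_ij = P(C=i | G=j); each column sums to 1
    psi: length-K list of component responsibilities P(G=j | x) from a fitted GMM
    """
    return [sum(r_ij * p_j for r_ij, p_j in zip(row, psi)) for row in R]


# Two classes (L = 2) and three mixture components (K = 3), illustrative values.
R = [
    [0.9, 0.2, 0.0],
    [0.1, 0.8, 1.0],
]
psi = [0.5, 0.3, 0.2]
posterior = class_posterior(R, psi)
predicted_class = posterior.index(max(posterior))
```

&lt;p>Because each column of $R$ sums to one and the responsibilities sum to one, the resulting class posteriors also sum to one.&lt;/p>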
&lt;hr>
&lt;img src="./img/MDA vs RMDA paper figure.png" height="640px">
&lt;/section>
&lt;hr>
&lt;section>
&lt;h2 id="class-rebalancing">Class rebalancing&lt;/h2>
&lt;!-- LGBM confusion matrix
['YI_conc', 'FYI_conc', 'MYI_conc', 'water_conc']
[[0.59237875 0.27944573 0.0369515 0.09122402]
[0.02884615 0.94346154 0.01884615 0.00884615]
[0.08886389 0.64904387 0.22497188 0.03712036]
[0.02849003 0.01745014 0.00747863 0.9465812 ]] -->
&lt;div class="multi-column">
&lt;div class="col1">
&lt;span class="fragment " >
&lt;ul>
&lt;li>Training score: 83.7%&lt;/li>
&lt;li>Validation score: 81.3%&lt;/li>
&lt;/ul>
&lt;/span>
&lt;/div>
&lt;div class="col1" style="font-size:0.4em">
&lt;span class="fragment " >
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>&lt;/th>
&lt;th>YI&lt;/th>
&lt;th>FYI&lt;/th>
&lt;th>MYI&lt;/th>
&lt;th>water&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>YI&lt;/strong>&lt;/td>
&lt;td>0.59237875&lt;/td>
&lt;td>0.27944573&lt;/td>
&lt;td>0.0369515&lt;/td>
&lt;td>0.09122402&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>FYI&lt;/strong>&lt;/td>
&lt;td>0.02884615&lt;/td>
&lt;td>0.94346154&lt;/td>
&lt;td>0.01884615&lt;/td>
&lt;td>0.00884615&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>MYI&lt;/strong>&lt;/td>
&lt;td>0.08886389&lt;/td>
&lt;td>0.64904387&lt;/td>
&lt;td>0.22497188&lt;/td>
&lt;td>0.03712036&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>water&lt;/strong>&lt;/td>
&lt;td>0.02849003&lt;/td>
&lt;td>0.01745014&lt;/td>
&lt;td>0.00747863&lt;/td>
&lt;td>0.9465812&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;/span>
&lt;/div>
&lt;/div>
&lt;span class="fragment " >
&lt;ul>
&lt;li>The entire dataset contains 7.39M rows.&lt;/li>
&lt;li>After filtering rows where we have high confidence in labels (YI&amp;gt;90%, FYI&amp;gt;99.9%, MYI&amp;gt;99%, Water=100%), we are left with:
&lt;ul>
&lt;li>YI: 9801 rows&lt;/li>
&lt;li>FYI: 28266 rows&lt;/li>
&lt;li>MYI: 9805 rows&lt;/li>
&lt;li>Water: 3.18M rows&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/span>
&lt;span class="fragment " >
&lt;p style="font-size:0.6em">&lt;strong>Solution&lt;/strong>: SMOTE based class rebalancing&lt;/p>
&lt;/span>
&lt;hr>
&lt;h2 id="smote">SMOTE&lt;/h2>
&lt;ul>
&lt;li>Synthetic Minority Over-sampling Technique
&lt;ol>
&lt;li>Identify the minority class&lt;/li>
&lt;li>Find its nearest neighbours&lt;/li>
&lt;li>Generate synthetic samples by interpolating between the minority class and its neighbours&lt;/li>
&lt;/ol>
&lt;/li>
&lt;li>After SMOTE each class contains 3.18M rows&lt;/li>
&lt;li>So we randomly undersampled and treated the number of samples in each class as a hyperparameter&lt;/li>
&lt;li>More sophisticated undersampling techniques (Tomek, Edited Nearest Neighbours) did not prune the dataset enough&lt;/li>
&lt;/ul>
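&lt;p>The three SMOTE steps above can be sketched in a few lines. This is a brute-force illustration with hypothetical names and toy data; at the scale of this dataset a library implementation (for example the SMOTE class in imbalanced-learn) would be the practical choice.&lt;/p>

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority-class points.

    Each synthetic point lies on the segment between a randomly chosen
    minority point and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation fraction in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Four minority points on the corners of the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_new=5)
```

&lt;p>Every synthetic point stays inside the region spanned by the original minority points, which is why SMOTE densifies a class without extending its range.&lt;/p>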
&lt;hr>
&lt;h2 id="performance-after-class-rebalancing">Performance after class rebalancing&lt;/h2>
&lt;div style="font-size:0.4em">
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>&lt;/th>
&lt;th>&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>Number of resampled rows in each class&lt;/td>
&lt;td>~1.6M&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Training accuracy&lt;/td>
&lt;td>99.34%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Validation accuracy&lt;/td>
&lt;td>99.12%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Test accuracy&lt;/td>
&lt;td>96.34%&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;/div>
&lt;p>&lt;strong>Test Confusion matrix&lt;/strong>&lt;/p>
&lt;div style="font-size:0.4em">
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>&lt;/th>
&lt;th>YI&lt;/th>
&lt;th>FYI&lt;/th>
&lt;th>MYI&lt;/th>
&lt;th>water&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>YI&lt;/strong>&lt;/td>
&lt;td>0.67241379&lt;/td>
&lt;td>0.15922921&lt;/td>
&lt;td>0.04158215&lt;/td>
&lt;td>0.12677485&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>FYI&lt;/strong>&lt;/td>
&lt;td>0.04790419&lt;/td>
&lt;td>0.84677703&lt;/td>
&lt;td>0.08559352&lt;/td>
&lt;td>0.01972526&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>MYI&lt;/strong>&lt;/td>
&lt;td>0.046875&lt;/td>
&lt;td>0.25878906&lt;/td>
&lt;td>0.64257812&lt;/td>
&lt;td>0.05175781&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>water&lt;/strong>&lt;/td>
&lt;td>0.01741254&lt;/td>
&lt;td>0.00994283&lt;/td>
&lt;td>0.00626461&lt;/td>
&lt;td>0.96638002&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;/div>
&lt;/section>
&lt;hr>
&lt;h2 id="updated-ice-probability-density">Updated ice probability density&lt;/h2>
&lt;img src="./img/updated non-zero ice distributions.png" height=600px>
&lt;hr>
&lt;section>
&lt;h2 id="umap-uniform-manifold-approximation-and-projection">UMAP (Uniform Manifold Approximation and Projection)&lt;/h2>
&lt;span class="fragment " >
&lt;ul>
&lt;li>UMAP is a non-linear dimensionality reduction technique that is particularly well-suited for visualizing complex data in a low-dimensional space&lt;/li>
&lt;li>UMAP assumes that high-dimensional data lies on a low-dimensional manifold embedded in the higher-dimensional space&lt;/li>
&lt;/ul>
&lt;/span>
&lt;span class="fragment " >
&lt;ul>
&lt;li>UMAP first constructs a weighted k-nearest neighbor graph from the high-dimensional data&lt;/li>
&lt;li>UMAP then defines a low-dimensional representation and uses stochastic gradient descent to optimize the layout of the low-dimensional points, preserving the structure of the high-dimensional graph as closely as possible&lt;/li>
&lt;/ul>
&lt;/span>
&lt;span class="fragment " >
&lt;ul>
&lt;li>It thus preserves both local and global structure of the data&lt;/li>
&lt;li>Supervised mode:
&lt;ul>
&lt;li>During knn graph construction, UMAP uses the labels to weight the edges in the graph so that points with the same label are closer in the low-dimensional space&lt;/li>
&lt;li>Also adds a loss term to the optimization function that penalizes points with the same label being far apart in the low-dimensional space&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/span>
&lt;hr>
&lt;img src="./img/umap transformation.png">
&lt;/section>
&lt;hr>
&lt;h2 id="performance-after-umap-feature-transformation">Performance after UMAP feature transformation&lt;/h2>
&lt;div style="font-size:0.4em">
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>&lt;/th>
&lt;th>&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>Number of resampled rows in each class&lt;/td>
&lt;td>~300K&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Training accuracy&lt;/td>
&lt;td>99.99%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Validation accuracy&lt;/td>
&lt;td>99.84%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Test accuracy&lt;/td>
&lt;td>93.34%&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;/div>
&lt;p>&lt;strong>Test Confusion matrix&lt;/strong>&lt;/p>
&lt;div style="font-size:0.4em">
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>&lt;/th>
&lt;th>YI&lt;/th>
&lt;th>FYI&lt;/th>
&lt;th>MYI&lt;/th>
&lt;th>water&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>YI&lt;/strong>&lt;/td>
&lt;td>0.95866935&lt;/td>
&lt;td>0.02116935&lt;/td>
&lt;td>0.0141129&lt;/td>
&lt;td>0.00604839&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>FYI&lt;/strong>&lt;/td>
&lt;td>0.01243201&lt;/td>
&lt;td>0.95648796&lt;/td>
&lt;td>0.02175602&lt;/td>
&lt;td>0.00932401&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>MYI&lt;/strong>&lt;/td>
&lt;td>0.01208459&lt;/td>
&lt;td>0.02819738&lt;/td>
&lt;td>0.95166163&lt;/td>
&lt;td>0.00805639&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>water&lt;/strong>&lt;/td>
&lt;td>0.04524251&lt;/td>
&lt;td>0.00951845&lt;/td>
&lt;td>0.02106578&lt;/td>
&lt;td>0.92417327&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;/div>
&lt;hr>
&lt;h2 id="updated-ice-probability-density-1">Updated ice probability density&lt;/h2>
&lt;img src="./img/UMAP updated non-zero ice distributions.png" height=600px>
&lt;hr>
&lt;h3 id="updated-geographic-distribution---winter">Updated geographic distribution - winter&lt;/h3>
&lt;div class="multi-column">
&lt;div class="col2">
&lt;p>&lt;span class="fragment " >
&lt;img src="./img/Original U.Brem sure ice class geographic distribution - winter.png" style="margin-bottom: 0px; margin-top: 0px;">
&lt;/span>
&lt;span class="fragment " >
&lt;img src="./img/UMAP-LGBM updated sure ice class geographic distribution - winter.png" style="margin-bottom: 0px; margin-top: 0px;">
&lt;/span>
&lt;span class="fragment " >
&lt;img src="./img/UMAP-RMDA updated sure ice class geographic distribution - winter.png" style="margin-bottom: 0px; margin-top: 0px;">
&lt;/span>&lt;/p>
&lt;/div>
&lt;div class="col1">
&lt;img src="./img/sod_ant_20200917.png" style="margin-top: 150px">
&lt;/div>
&lt;/div>
&lt;hr>
&lt;h3 id="updated-geographic-distribution---summer">Updated geographic distribution - summer&lt;/h3>
&lt;div class="multi-column">
&lt;div class="col2">
&lt;p>&lt;span class="fragment " >
&lt;img src="./img/Original U.Brem sure ice class geographic distribution - summer.png" style="margin-bottom: 0px; margin-top: 0px;">
&lt;/span>
&lt;span class="fragment " >
&lt;img src="./img/UMAP-LGBM updated sure ice class geographic distribution - summer.png" style="margin-bottom: 0px; margin-top: 0px;">
&lt;/span>
&lt;span class="fragment " >
&lt;img src="./img/UMAP-RMDA updated sure ice class geographic distribution - summer.png" style="margin-bottom: 0px; margin-top: 0px;">
&lt;/span>&lt;/p>
&lt;/div>
&lt;div class="col1">
&lt;img src="./img/sod_ant_20200213.png" style="margin-top: 150px">
&lt;/div>
&lt;/div>
&lt;hr>
&lt;section data-background-image="https://images.unsplash.com/photo-1493329087152-853abc04b84b?q=80&amp;w=3876&amp;auto=format&amp;fit=crop&amp;ixlib=rb-4.0.3&amp;ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D" data-background-opacity=0.3>
&lt;ul>
&lt;li>GNSS-R is an exciting new data source for monitoring sea ice at very high resolution&lt;/li>
&lt;li>we fitted models to predict ice types and used these models to update existing ice concentration datasets&lt;/li>
&lt;li>we pick up many more locations previously missed thus improving existing ice concentration datasets&lt;/li>
&lt;li>this was only one year of data, the accuracy will improve with more data&lt;/li>
&lt;/ul>
&lt;p>More validation required! We are open to ideas&lt;/p>
&lt;/section>
&lt;hr>
&lt;h1 id="thank-you">Thank you&lt;/h1>
&lt;img style="border-radius:50%" src="https://s.gravatar.com/avatar/f5f75fa059f22f822b95edc56c68930873203f6816d9d0864a75d82b2453e52f?s=250">
&lt;br/>
&lt;a href="mailto:s.contractor@unsw.edu.au" aria-label="envelope">
&lt;i class="fas fa-envelope big-icon">&lt;/i>
&lt;/a>
&lt;a href="https://twitter.com/stefancontracto" target="_blank" rel="noopener" aria-label="twitter">
&lt;i class="fab fa-twitter big-icon">&lt;/i>
&lt;/a>
&lt;a href="https://scholar.google.co.uk/citations?user=sEnHZ3AAAAAJ" target="_blank" rel="noopener" aria-label="google-scholar">
&lt;i class="ai ai-google-scholar big-icon">&lt;/i>
&lt;/a>
&lt;a href="https://github.com/steefancontractor" target="_blank" rel="noopener" aria-label="github">
&lt;i class="fab fa-github big-icon">&lt;/i>
&lt;/a>
&lt;a href="https://www.linkedin.com/in/steefan-contractor-b375bb209/" target="_blank" rel="noopener" aria-label="linkedin">
&lt;i class="fab fa-linkedin big-icon">&lt;/i>
&lt;/a>
&lt;a href="https://mastodon.au/@stefancontracto" target="_blank" rel="noopener" aria-label="mastodon">
&lt;i class="fab fa-mastodon big-icon">&lt;/i>
&lt;/a></description></item><item><title>Deep Sea Talk</title><link>https://steefancontractor.github.io/slides/deep-sea-talk/</link><pubDate>Tue, 28 Nov 2023 11:11:50 +1100</pubDate><guid>https://steefancontractor.github.io/slides/deep-sea-talk/</guid><description>&lt;link rel="stylesheet" href="reveal_custom.css">
&lt;link rel="stylesheet" href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css">
&lt;div class="vert-banner">
&lt;img src="./img/UNSW_logo-portrait-light_transparent.png">
&lt;img src="./img/Udash logo CMYK revised-03.png">
&lt;img src="https://antarctic.org.au/wp-content/uploads/2021/10/ACEAS-Logo-Concept-1-WHT.png" style="height:10vh">
&lt;/div>
&lt;script>
function add_vert_banner() {
let vertbanner = document.querySelector("div.vert-banner");
let reveal = document.querySelector(".reveal");
reveal.insertBefore(vertbanner, reveal.firstChild);
}
window.addEventListener("load", add_vert_banner);
&lt;/script>
&lt;section data-background-image="https://images.unsplash.com/photo-1445112098124-3e76dd67983c?q=80&amp;w=3872&amp;auto=format&amp;fit=crop&amp;ixlib=rb-4.0.3&amp;ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D" data-background-opacity=0.5>
&lt;h1 id="observing-oceans-from-above">Observing oceans from above&lt;/h1>
&lt;p>Dr. Steefan Contractor&lt;/p>
&lt;/section>
&lt;hr>
&lt;section data-background-image="https://images.unsplash.com/photo-1445112098124-3e76dd67983c?q=80&amp;w=3872&amp;auto=format&amp;fit=crop&amp;ixlib=rb-4.0.3&amp;ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D" data-background-opacity=0.5>
&lt;h1 id="a-lot-happens-on-the-ocean-surface">A lot happens on the ocean surface&lt;/h1>
&lt;ul>
&lt;li>Interaction with the atmosphere
&lt;ul>
&lt;li>air-sea heat fluxes&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Interaction with the biosphere
&lt;ul>
&lt;li>confluence of birds, fish and mammals&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Fisheries, transport, tourism and other commercial activities&lt;/li>
&lt;li>Political and defence-related activities&lt;/li>
&lt;/ul>
&lt;/section>
&lt;hr>
&lt;section data-background-image="https://images.unsplash.com/photo-1528989292939-085939b5722d?q=80&amp;w=5166&amp;auto=format&amp;fit=crop&amp;ixlib=rb-4.0.3&amp;ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D" data-background-opacity=1>&lt;/section>
&lt;hr>
&lt;section data-background-image="https://images.unsplash.com/photo-1549598685-0058b114c9d6?q=80&amp;w=3078&amp;auto=format&amp;fit=crop&amp;ixlib=rb-4.0.3&amp;ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D" data-background-opacity=0.5>
&lt;h1 id="sea-ice">Sea Ice&lt;/h1>
&lt;/section>
&lt;hr>
&lt;section data-background-image="https://images.unsplash.com/photo-1549598685-0058b114c9d6?q=80&amp;w=3078&amp;auto=format&amp;fit=crop&amp;ixlib=rb-4.0.3&amp;ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D" data-background-opacity=0.5>
&lt;h1 id="sea-ice-1">Sea Ice&lt;/h1>
&lt;div class="multi-column">
&lt;div class="col1">
&lt;img src="https://www.climate.gov/sites/default/files/styles/full_width_620_original_image/public/2023-03/ClimateDashboard-Antarctic-sea-ice-winter-maximum-graph-20230307-1400px.jpg?itok=I9czOFwT" style="height:20vh">
&lt;br>
&lt;img src="https://www.climate.gov/sites/default/files/styles/full_width_620_original_image/public/2023-03/ClimateDashboard-Antarctic-sea-ice-winter-maximum-map_202209_1400px.jpg?itok=G1R5j_vY" style="height:20vh">
&lt;/div>
&lt;div class="col1">
&lt;img src="https://www.climate.gov/sites/default/files/styles/full_width_620_original_image/public/2023-03/ClimateDashboard-Antarctic-sea-ice-summer-minimum-graph-20230307-1400px.jpg?itok=qHONdopL" style="height:20vh">
&lt;br>
&lt;img src="https://www.climate.gov/sites/default/files/styles/full_width_620_original_image/public/2023-03/ClimateDashboard-Antarctic-sea-ice-summer-minimum-map_202302_1400px.jpg?itok=i-GXfY7N" style="height:20vh">
&lt;/div>
&lt;/div>
&lt;/section>
&lt;hr>
&lt;section data-background-image="https://www.climate.gov/sites/default/files/styles/full_width_stretch_featured_image/public/2021-10/Antarctic_seaice_Oct_v_Jan_250m.jpg?itok=3hqMmRJT" data-background-position="right" data-background-size="contain">
&lt;p style="text-align: left">Detecting sea ice from space&lt;/p>
&lt;/section>
&lt;hr>
&lt;section data-background-image="https://images.unsplash.com/photo-1446776811953-b23d57bd21aa?q=80&amp;w=4928&amp;auto=format&amp;fit=crop&amp;ixlib=rb-4.0.3&amp;ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D" data-background-opacity=0.4>
&lt;h1 id="uk-aus-space-bridge-grant">UK-Aus Space Bridge Grant&lt;/h1>
&lt;img src="https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Ftse1.mm.bing.net%2Fth%3Fid%3DOIP.7wdx11_LaJpRT8sZPc_EcwHaEK%26pid%3DApi&amp;f=1&amp;ipt=bcce3e8299435589a46a0176f72bae31a5eba61677ccbd50bc5f30a9a619dac8&amp;ipo=images" style="border-radius:20%">
&lt;div class="multi-column">
&lt;div class="col1">
&lt;img src="https://spire.com/wp-content/themes/spire2021/img/spire-global-cubesat-satellite-logo.svg">
&lt;/div>
&lt;div class="col1">
&lt;img src="https://antarctic.org.au/wp-content/uploads/2021/10/ACEAS-Logo-Concept-1-WHT.png" style="height:10vh">
&lt;/div>
&lt;/div>
&lt;/section>
&lt;hr>
&lt;h1 id="global-navigation-satellite-system---reflectometry">Global Navigation Satellite System - Reflectometry&lt;/h1>
&lt;img src="./img/GNSS-R.jpg" style="border-radius:10%">
&lt;hr>
&lt;h1 id="gnss-r-features">GNSS-R Features&lt;/h1>
&lt;div class="multi-column">
&lt;div class="col1">
Phase Noise
&lt;img src="./img/phase-noise.jpg">
&lt;/div>
&lt;div class="col1">
Excess Phase Noise
&lt;img src="./img/excess-phase-noise.jpg">
&lt;/div>
&lt;/div>
&lt;hr>
&lt;section data-background-image="img/feature-distribution.png" data-background-position="center" data-background-size="contain">
&lt;/section>
&lt;hr>
&lt;section data-background-image="img/label-distribution.png" data-background-position="center" data-background-size="contain">
&lt;/section>
&lt;hr>
&lt;h1 id="waterice-and-ice-type-classification">Water/ice and ice-type classification&lt;/h1>
&lt;div class="multi-column">
&lt;div class="col1">
&lt;img src="./img/reduced-feature-scatter_waterice.png" style="height:30vh">
&lt;/div>
&lt;div class="col1">
&lt;img src="./img/reduced-feature-scatter-icetypes.png" style="height:30vh">
&lt;/div>
&lt;/div>
&lt;hr>
&lt;h1 id="thank-you">Thank you&lt;/h1>
&lt;img style="border-radius:50%" src="https://s.gravatar.com/avatar/f5f75fa059f22f822b95edc56c68930873203f6816d9d0864a75d82b2453e52f?s=250">
&lt;br/>
&lt;a href="mailto:s.contractor@unsw.edu.au" aria-label="envelope">
&lt;i class="fas fa-envelope big-icon">&lt;/i>
&lt;/a>
&lt;a href="https://twitter.com/stefancontracto" target="_blank" rel="noopener" aria-label="twitter">
&lt;i class="fab fa-twitter big-icon">&lt;/i>
&lt;/a>
&lt;a href="https://scholar.google.co.uk/citations?user=sEnHZ3AAAAAJ" target="_blank" rel="noopener" aria-label="google-scholar">
&lt;i class="ai ai-google-scholar big-icon">&lt;/i>
&lt;/a>
&lt;a href="https://github.com/steefancontractor" target="_blank" rel="noopener" aria-label="github">
&lt;i class="fab fa-github big-icon">&lt;/i>
&lt;/a>
&lt;a href="https://www.linkedin.com/in/steefan-contractor-b375bb209/" target="_blank" rel="noopener" aria-label="linkedin">
&lt;i class="fab fa-linkedin big-icon">&lt;/i>
&lt;/a>
&lt;a href="https://mastodon.au/@stefancontracto" target="_blank" rel="noopener" aria-label="mastodon">
&lt;i class="fab fa-mastodon big-icon">&lt;/i>
&lt;/a></description></item><item><title>Intro</title><link>https://steefancontractor.github.io/slides/year10-data-sci-workshop/intro/</link><pubDate>Fri, 17 Nov 2023 17:22:03 +1100</pubDate><guid>https://steefancontractor.github.io/slides/year10-data-sci-workshop/intro/</guid><description>&lt;link rel="stylesheet" href="../reveal_custom.css">
&lt;link rel="stylesheet" href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css">
&lt;div class="vert-banner">
&lt;img src="../img/UNSW_logo-portrait-light_transparent.png">
&lt;img src="../img/Udash logo CMYK revised-03.png">
&lt;/div>
&lt;script>
function add_vert_banner() {
let vertbanner = document.querySelector("div.vert-banner");
let reveal = document.querySelector(".reveal");
reveal.insertBefore(vertbanner, reveal.firstChild);
}
window.addEventListener("load", add_vert_banner);
&lt;/script>
&lt;!-- &lt;section data-background-image="https://media.giphy.com/media/4FQMuOKR6zQRO/giphy.gif"
data-background-opacity=0.1
data-background-size=cover> -->
&lt;h1 id="year-10-data-science-work-experience-week">Year 10 Data Science Work Experience Week&lt;/h1>
&lt;p>Welcome!&lt;/p>
&lt;!-- &lt;/section> -->
&lt;hr>
&lt;section data-background-image="https://images.unsplash.com/photo-1669574034940-27c886b6f5bb?q=80&amp;w=3540&amp;auto=format&amp;fit=crop&amp;ixlib=rb-4.0.3&amp;ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D"
data-background-opacity=0.3
data-background-size=cover>
&lt;!--
&lt;section data-noprocess data-shortcode-slide
data-background-image="https://media.giphy.com/media/3oKIPEqDGUULpEU0aQ/giphy.gif"
>
-->
&lt;h1 id="acknowledgement-of-country">Acknowledgement of Country&lt;/h1>
&lt;p>UNSW Kensington Campus is located on the unceded lands of the Bedigal people. We pay our respects to their Elders, past and present, as the Traditional Custodians of this land.&lt;/p>
&lt;/section>
&lt;hr>
&lt;section data-background-image="https://images.unsplash.com/photo-1552879890-3a06dd3a06c2?q=80&amp;w=2754&amp;auto=format&amp;fit=crop&amp;ixlib=rb-4.0.3&amp;ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D" data-background-opacity=0.4>
&lt;h1 id="health-and-safety-induction">Health and Safety Induction&lt;/h1>
&lt;ul>
&lt;li>We are following the HS414 Visitors to UNSW Facilities Guideline&lt;/li>
&lt;li>Visitors will be met by the person organizing the visit&lt;/li>
&lt;li>Visitors need to be accompanied and supervised by UNSW staff members or teaching assistants&lt;/li>
&lt;li>Visitors are not allowed in medium- or high-risk areas&lt;/li>
&lt;li>Please fill in the HS630 Visitor Induction Form with your contact details and check all items that apply to the induction&lt;/li>
&lt;li>All visitors must agree to follow any reasonable instruction in relation to health and safety&lt;/li>
&lt;li>Report all work-related hazards, incidents, injuries and illnesses to supervising staff members or teaching assistants&lt;/li>
&lt;/ul>
&lt;/section>
&lt;hr>
&lt;h1 id="general-safety-in-the-classroom">General safety in the classroom&lt;/h1>
&lt;ul>
&lt;li>No special gear is required&lt;/li>
&lt;li>No food or drinks in the classroom&lt;/li>
&lt;li>Please keep the classroom clean and tidy&lt;/li>
&lt;li>Locate first aid equipment&lt;/li>
&lt;li>Locate fire extinguishers&lt;/li>
&lt;/ul>
&lt;p>You are responsible for your own safety and the safety of others around you. Take care!&lt;/p>
&lt;ul>
&lt;li>If you see something unsafe, tell your teacher or another staff member&lt;/li>
&lt;/ul>
&lt;hr>
&lt;section data-background-image="https://images.unsplash.com/photo-1583947582886-f40ec95dd752?q=80&amp;w=2670&amp;auto=format&amp;fit=crop&amp;ixlib=rb-4.0.3&amp;ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D" data-background-opacity=0.4>
&lt;h1 id="cold--flu-safety">Cold &amp;amp; Flu safety&lt;/h1>
&lt;ul>
&lt;li>You are welcome to use face masks and are even encouraged to use them&lt;/li>
&lt;li>Sanitise/wash hands regularly&lt;/li>
&lt;li>If you feel unwell, please stay at home&lt;/li>
&lt;/ul>
&lt;/section>
&lt;hr>
&lt;section data-background-image="./evacuation diagram d26.jpeg" data-background-position="right" data-background-size="contain">
&lt;h1 id="emergency-procedure">Emergency Procedure&lt;/h1>
&lt;p>In case of an emergency&lt;/p>
&lt;ul>
&lt;li>call UNSW Security Services&lt;/li>
&lt;li>do not call 000&lt;/li>
&lt;/ul>
&lt;p>Security Services&lt;/p>
&lt;ul>
&lt;li>
&lt;p>in an emergency &lt;i class="fas fa-solid fa-phone">&lt;/i> 9385 6666&lt;/p>
&lt;/li>
&lt;li>
&lt;p>everything else &lt;i class="fas fa-solid fa-phone">&lt;/i> 9385 6000&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Security Office located at Gate 2, open 24/7&lt;/p>
&lt;/li>
&lt;/ul>
&lt;a href="https://www.student.unsw.edu.au/security">
&lt;i class="fas fa-solid fa-home">&lt;/i>
&lt;/a>
&lt;hr>
&lt;h1 id="emergency-evacuation">Emergency Evacuation&lt;/h1>
&lt;ul>
&lt;li>BEEP BEEP: prepare to evacuate. Do not leave until WHOOP WHOOP alarm&lt;/li>
&lt;li>WHOOP WHOOP: evacuate immediately. Follow staff and building warden instructions&lt;/li>
&lt;li>Assemble in front of the John Clancy Auditorium&lt;/li>
&lt;/ul>
&lt;/section>
&lt;hr>
&lt;h1 id="first-aid---physical">First Aid - Physical&lt;/h1>
&lt;ul>
&lt;li>Location of first aid kit and first aid room (Rm G003, E26)&lt;/li>
&lt;li>Location of Automated External Defibrillator&lt;/li>
&lt;li>Any incident requiring the use of first aid, however minor, must be reported online to UNSW&lt;/li>
&lt;/ul>
&lt;hr>
&lt;section data-background="https://images.unsplash.com/photo-1656501378122-928d5ff4153e?q=80&amp;w=2664&amp;auto=format&amp;fit=crop&amp;ixlib=rb-4.0.3&amp;ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D" data-background-opacity=0.4>
&lt;h1 id="first-aid---mental">First Aid - Mental&lt;/h1>
&lt;p>&lt;a href="https://www.unsw.edu.au/planning-assurance/safety/safer-communities/gendered-violence/find-first-responders" target="_blank" rel="noopener">UNSW First Responders&lt;/a> are students and staff who are trained to offer you confidential support. They understand that reporting gendered violence can be difficult and can provide you with guidance and support.&lt;/p>
&lt;p>Scott Sisson, uDASH director &lt;a href="mailto:uDASH@unsw.edu.au" aria-label="envelope">
&lt;i class="fas fa-envelope big-icon">&lt;/i>
&lt;/a>&lt;/p>
&lt;p>You can also contact a certified &lt;a href="https://www.edi.unsw.edu.au/get-involved/ally-network" target="_blank" rel="noopener">ally@UNSW&lt;/a> in the School of Maths and Stats, Faculty of Science. The ally@UNSW network aims to ensure UNSW is a safe and welcoming place for all LGBTIQ+ students and staff.&lt;/p>
&lt;/section>
&lt;hr>
&lt;iframe width="800" height="600" frameBorder="0" scrolling="no" marginHeight="0" marginWidth="0" src="https://use.mazemap.com/embed.html#v=1&amp;config=unsw&amp;campusid=111&amp;zlevel=7&amp;center=151.235024,-33.917283&amp;zoom=18&amp;sharepoitype=poi&amp;sharepoi=1001050575&amp;utm_medium=iframe" style="border: 1px solid grey" allow="geolocation">&lt;/iframe>&lt;br/>&lt;small>&lt;a href="https://www.mazemap.com/">Map by MazeMap&lt;/a>&lt;/small>
&lt;hr>
&lt;h1 id="unsw-year-10-data-science-work-experience-week">UNSW Year 10 Data Science Work Experience Week&lt;/h1>
&lt;h2 id="aim">Aim&lt;/h2>
&lt;p>Learn the basic principles of data science: exploratory data analysis, statistical modelling and visualisation.&lt;/p>
&lt;h2 id="students-will">Students will&lt;/h2>
&lt;ul>
&lt;li>learn basic programming concepts&lt;/li>
&lt;li>get hands-on experience in analysing real-world datasets&lt;/li>
&lt;li>work in independent teams on data science projects&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h1 id="unsw-data-science-hub-udash">UNSW Data Science Hub (uDASH)&lt;/h1>
&lt;p>An official UNSW Research Centre in the School of Mathematics &amp;amp; Statistics&lt;/p>
&lt;ul>
&lt;li>formally established in 2021&lt;/li>
&lt;/ul>
&lt;p>Aim&lt;/p>
&lt;ul>
&lt;li>Bring together UNSW&amp;rsquo;s full spectrum of data specialists to solve complex, real-world challenges faced by governments and businesses&lt;/li>
&lt;/ul>
&lt;p>Data experts across UNSW&lt;/p>
&lt;ul>
&lt;li>100+ data experts across UNSW&amp;rsquo;s broad and diverse faculties (Science, Arts, Engineering, Medicine, Law, Business, and Aus. Defence Force Academy)&lt;/li>
&lt;li>All experts research cutting-edge applications using data science tools&lt;/li>
&lt;/ul>
&lt;p>&lt;q>We translate large volumes of data into knowledge to support decision-making.&lt;/q>&lt;/p>
&lt;hr>
&lt;h1 id="udash">uDASH&lt;/h1>
&lt;p>Our expertise:&lt;/p>
&lt;ul>
&lt;li>Mathematics and Statistics&lt;/li>
&lt;li>Machine Learning and Artificial Intelligence&lt;/li>
&lt;li>Data Visualisation&lt;/li>
&lt;li>Computational modelling and simulation&lt;/li>
&lt;li>Non-linear dynamics and optimisation&lt;/li>
&lt;li>Data privacy&lt;/li>
&lt;li>Probabilistic modelling&lt;/li>
&lt;li>Risk quantification and management&lt;/li>
&lt;li>Business, economics, and marketing&lt;/li>
&lt;li>Spatial modelling&lt;/li>
&lt;li>Big and complex data&lt;/li>
&lt;li>Genomics and medical data&lt;/li>
&lt;li>Ecological, environmental and climate data&lt;/li>
&lt;li>Defence research&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h1 id="workshop-program">Workshop Program&lt;/h1>
&lt;section data-background-image="https://images.unsplash.com/photo-1435527173128-983b87201f4d?q=80&amp;w=2667&amp;auto=format&amp;fit=crop&amp;ixlib=rb-4.0.3&amp;ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D" data-background-opacity=0.4>
&lt;h1 id="morning-sessions-930am---1200pm">Morning sessions 9:30am - 12:00pm&lt;/h1>
&lt;ul>
&lt;li>Interactive lecture style sessions 10:00am - 12:00pm&lt;/li>
&lt;li>Special talks 9:30 - 10:00am&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h1 id="lunch-1200pm---130pm">Lunch 12:00pm - 1:30pm&lt;/h1>
&lt;ul>
&lt;li>
&lt;p>Free time to explore, get food, etc.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Optional but recommended fun activities&lt;/p>
&lt;ul>
&lt;li>School of Mathematics and Statistics outreach workshops (45min)&lt;/li>
&lt;li>Datasoc campus tour (30min)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h1 id="afternoon-sessions-130pm---400pm">Afternoon sessions 1:30pm - 4:00pm&lt;/h1>
&lt;ul>
&lt;li>Independent project work&lt;/li>
&lt;li>groups of 5-6 students&lt;/li>
&lt;li>Goal: visualise patterns, postulate hypotheses, analyse them statistically and discuss the results&lt;/li>
&lt;li>Demonstrators and instructors will give advice and recommendations&lt;/li>
&lt;/ul>
&lt;/section>
&lt;hr>
&lt;section>
&lt;h1 id="monday">Monday&lt;/h1>
&lt;p>Morning session:&lt;/p>
&lt;ul>
&lt;li>Induction&lt;/li>
&lt;li>Introduction to data science&lt;/li>
&lt;li>Software + Intro to programming with R&lt;/li>
&lt;/ul>
&lt;p>Afternoon session:&lt;/p>
&lt;ul>
&lt;li>Intro to datasets&lt;/li>
&lt;li>Project group selection&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h1 id="tuesday">Tuesday&lt;/h1>
&lt;p>Morning session:&lt;/p>
&lt;ul>
&lt;li>Meet a data scientist
&lt;ul>
&lt;li>Nick Lillywhite, &lt;a href="https://www.bioscout.com.au" target="_blank" rel="noopener">BioScout&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Exploratory data analysis&lt;/li>
&lt;/ul>
&lt;p>Afternoon session:&lt;/p>
&lt;ul>
&lt;li>Work on projects (data exploration)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h1 id="wednesday">Wednesday&lt;/h1>
&lt;p>Morning session:&lt;/p>
&lt;ul>
&lt;li>Meet a data scientist
&lt;ul>
&lt;li>Peter Hartmann, Westpac Group&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Statistical Modelling&lt;/li>
&lt;/ul>
&lt;p>Afternoon session:&lt;/p>
&lt;ul>
&lt;li>Work on projects (statistical modelling)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h1 id="thursday">Thursday&lt;/h1>
&lt;p>Morning session:&lt;/p>
&lt;ul>
&lt;li>Datasoc: who they are and how they can improve your student experience&lt;/li>
&lt;li>Data visualisation&lt;/li>
&lt;/ul>
&lt;p>Afternoon session:&lt;/p>
&lt;ul>
&lt;li>Work on projects (visualising results and wrap up)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h1 id="friday">Friday&lt;/h1>
&lt;div class="multi-column">
&lt;div class="col1">
&lt;ul>
&lt;li>Presentations&lt;/li>
&lt;li>Program wrap-up&lt;/li>
&lt;/ul>
&lt;img src="presentation2.jpeg" style="border-radius:50%; height: 40vh" >
&lt;/div>
&lt;div class="col1">
&lt;img src="presentation1.jpeg" style="border-radius:50%; height: 30vh" >
&lt;/div>
&lt;/div>
&lt;/section>
&lt;hr>
&lt;section data-background="./group.jpeg" data-background-opacity=0.7>
&lt;/section>
&lt;hr>
&lt;h1 id="your-instructors">Your instructors&lt;/h1>
&lt;div class="multi-column">
&lt;div class="col1">
&lt;p>Dr. Steefan Contractor
&lt;img src="https://s.gravatar.com/avatar/f5f75fa059f22f822b95edc56c68930873203f6816d9d0864a75d82b2453e52f?s=250" height=180px>&lt;/p>
&lt;p>Dr. Ziyang Lyu
&lt;img src="https://research.unsw.edu.au/sites/default/files/styles/profile/public/images/profile/IMG_2294.JPG?itok=L2ykujSb" height=180px>&lt;/p>
&lt;/div>
&lt;div class="col1">
&lt;p>Dr. José Ferrer
&lt;img src="https://api.research.unsw.edu.au/sites/default/files/images/profile/JR_Ferrer_Paris.jpg" height=180px>&lt;/p>
&lt;p>Dr. Maeve McGillycuddy
&lt;img src="https://www.unsw.edu.au/content/dam/images/science/math-stats/news/2022-01-science/2022-04-maeve-McGillycuddy-001.jpeg" height=180px>&lt;/p>
&lt;/div>
&lt;div class="col1">
&lt;p>Dr. Sean Gardiner
&lt;img src="https://api.research.unsw.edu.au/sites/default/files/images/profile/Sean_Gardiner_photo2_cropped.png" height=180px>&lt;/p>
&lt;p>Dr. Peng Zhong
&lt;img src="https://pangchung.github.io/images/profile.jpg" height=180px>&lt;/p>
&lt;/div>
&lt;div class="col1">
&lt;p>Dr. Boris Beranger
&lt;img src="https://research.unsw.edu.au/sites/default/files/styles/profile/public/images/profile/ID_picture.jpg?itok=dEOUHrcH" height=180px>&lt;/p>
&lt;p>Dr. Prosha Rahman
&lt;img src="https://acems.org.au/sites/default/files/styles/profile/public/images/profiles/73229086_2410000075794867_8865843119296348160_n.jpg?itok=Fcgq48tQ" height=180px>&lt;/p>
&lt;!-- Dr. Daniel Hewitt
&lt;img src="https://www.unsw.edu.au/content/dam/images/photos/people/headshots/science/2023-01-hdr-student-profiles/2022-12_Daniel%20Hewitt.jpg" height=200px> -->
&lt;/div>
&lt;div class="col1">
&lt;p>Dr. Anikó Tóth
&lt;img src="https://api.research.unsw.edu.au/sites/default/files/images/profile/10548779_2720616785933_941271344440781445_o.jpg" height=180px>&lt;/p>
&lt;/div>
&lt;/div>
&lt;hr>
&lt;h1 id="contacts-and-socials">Contacts and Socials&lt;/h1>
&lt;a href="https://www.unsw.edu.au/research/udash">
&lt;i class="fas fa-solid fa-home">&lt;/i>
&lt;/a>
&lt;a href="mailto:uDASH@unsw.edu.au" aria-label="envelope">
&lt;i class="fas fa-envelope big-icon">&lt;/i>
&lt;/a>
&lt;a href="https://twitter.com/uDASH_UNSW" target="_blank" rel="noopener" aria-label="twitter">
&lt;i class="fab fa-twitter big-icon">&lt;/i>
&lt;/a>
&lt;p>For urgent matters find my &lt;i class="fas fa-solid fa-phone">&lt;/i> on &lt;i class="fab fa-brands fa-slack">&lt;/i> (Slack)&lt;/p>
&lt;hr>
&lt;h1 id="break">BREAK&lt;/h1>
&lt;p>Quick stretch, walk around, switch tables&lt;/p>
&lt;p>What are you most excited for during the coming week?&lt;/p>
&lt;p>I go on a hike, and every time I spot a cockatoo, I note down the temperature and atmospheric pressure. Can I use this data to investigate the relationship between temperature and pressure?&lt;/p>
&lt;hr>
&lt;h1 id="what-is-data-science">What is data science?&lt;/h1>
&lt;!-- [Survey](https://www.menti.com/alsx726rihzo) -->
&lt;p>&lt;a href="https://www.menti.com/alpf4g89qdb2" target="_blank" rel="noopener">Survey&lt;/a>&lt;/p>
&lt;hr>
&lt;!-- &lt;div style='position: relative; padding-bottom: 56.25%; padding-top: 35px; height: 0; overflow: hidden;'>&lt;iframe sandbox='allow-scripts allow-same-origin allow-presentation' allowfullscreen='true' allowtransparency='true' frameborder='0' height='315' src='https://www.mentimeter.com/app/presentation/al897dcp2ng8s9vxfy74xuo7vjysb26s/embed' style='position: absolute; top: 0; left: 0; width: 100%; height: 100%;' width='420'>&lt;/iframe>&lt;/div> -->
&lt;div style='position: relative; padding-bottom: 56.25%; padding-top: 35px; height: 0; overflow: hidden;'>&lt;iframe sandbox='allow-scripts allow-same-origin allow-presentation' allowfullscreen='true' allowtransparency='true' frameborder='0' height='315' src='https://www.mentimeter.com/app/presentation/alepuy44anib628ejnv6poik6k8waivk/embed' style='position: absolute; top: 0; left: 0; width: 100%; height: 100%;' width='420'>&lt;/iframe>&lt;/div>
&lt;hr>
&lt;h1 id="get-inspired">Get inspired!&lt;/h1>
&lt;ul>
&lt;li>
&lt;p>Go to &lt;a href="https://www.historyofdatascience.com/families/discover-data-science-icons/" target="_blank" rel="noopener">historyofdatascience.com&lt;/a> and browse some profiles&amp;hellip;&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Or scroll through &lt;a href="https://timeline.historyofdatascience.com/" target="_blank" rel="noopener">the timeline&lt;/a>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Who is your favourite data science icon?&lt;/p>
&lt;ul>
&lt;li>an 18th century pioneer?&lt;/li>
&lt;li>a data revolutionary? a data hero?&lt;/li>
&lt;li>an artificial intelligence Jedi?&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;hr>
&lt;section data-background-image="https://miro.medium.com/max/720/1*T5GfsoZ-IWK3rcVkZ7R2bw.png" data-background-size="contain">
&lt;/section>
&lt;hr>
&lt;section>
&lt;h1 id="the-four-paradigms-of-research">The four paradigms of research&lt;/h1>
&lt;hr>
&lt;h1 id="paradigm-1-experimentation">Paradigm 1: Experimentation&lt;/h1>
&lt;p>Father of modern science&lt;/p>
&lt;img src="https://www.pioneeringminds.com/wp-content/uploads/2018/10/galileo-galilei-819977-1440x960.jpg" style="border-radius: 50%">
&lt;hr>
&lt;h1 id="paradigm-2-theory-led-experimentation">Paradigm 2: Theory-led experimentation&lt;/h1>
&lt;img src="https://scitechdaily.com/images/Event-Horizon-Telescope-Black-Hole-Image.jpg">
&lt;hr>
&lt;h1 id="paradigm-3-numerical-modelling">Paradigm 3: Numerical modelling&lt;/h1>
&lt;embed src="https://earth.nullschool.net" width=90% height=500px>
&lt;hr>
&lt;h1 id="paradigm-4-data-intensive-scientific-discovery">Paradigm 4: Data-intensive scientific discovery&lt;/h1>
&lt;p>The concept of these four paradigms of research was coined by Jim Gray, a 1998 Turing Award winner, in 2007.&lt;/p>
&lt;/section>
&lt;hr>
&lt;h1 id="data-science-is-more-than-just-prediction">Data science is more than just prediction&lt;/h1>
&lt;hr>
&lt;h1 id="introduction-to-programming-with-r-and-rstudio">Introduction to programming with R and RStudio&lt;/h1></description></item><item><title>GPCC_colloquium_2023</title><link>https://steefancontractor.github.io/slides/gpcc_colloquium_2023/</link><pubDate>Fri, 07 Jul 2023 23:47:42 +1000</pubDate><guid>https://steefancontractor.github.io/slides/gpcc_colloquium_2023/</guid><description>&lt;link rel="stylesheet" href="reveal_custom.css">
&lt;link rel="stylesheet" href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css">
&lt;div class="vert-banner">
&lt;img src="img/UNSW_logo-portrait-light_transparent.png">
&lt;img src="img/Udash logo CMYK revised-03.png">
&lt;/div>
&lt;script>
function add_vert_banner() {
let vertbanner = document.querySelector("div.vert-banner");
let reveal = document.querySelector(".reveal");
reveal.insertBefore(vertbanner, reveal.firstChild);
}
window.addEventListener("load", add_vert_banner);
&lt;/script>
&lt;h2 id="rainfall-estimates-on-a-gridded-network-regen">Rainfall Estimates on a Gridded Network (REGEN)&lt;/h2>
&lt;p>Steefan Contractor&lt;/p>
&lt;br/>
&lt;div class="citation">Contractor, S., Donat, M. G., Alexander, L. V., Ziese, M., Meyer-Christoffer, A., Schneider, U., Rustemeier, E., Becker, A., Durre, I., and Vose, R. S.: Rainfall Estimates on a Gridded Network (REGEN) – a global land-based gridded dataset of daily precipitation from 1950 to 2016, Hydrol. Earth Syst. Sci., 24, 919–943, 2020&lt;/div>
&lt;hr>
&lt;h1 id="basic-description">Basic description&lt;/h1>
&lt;ul>
&lt;li>Daily estimates over 1950 - 2016&lt;/li>
&lt;li>Gridded 1 degree latitude x 1 degree longitude resolution&lt;/li>
&lt;li>Global land coverage&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h1 id="purpose">Purpose&lt;/h1>
&lt;ul>
&lt;li>Purpose built for climate studies with a long temporal record and consistent global spatial analysis&lt;/li>
&lt;li>Based on a large in situ archive from combining GPCC with GHCN-Daily among others&lt;/li>
&lt;li>Includes various statistical model error estimates&lt;/li>
&lt;li>Also includes guidance for users less aware of issues with in situ based precipitation observations&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h1 id="in-situ-station-archive-description">In Situ Station Archive Description&lt;/h1>
&lt;div class="multi-column">
&lt;div class="col2">
&lt;br/>&lt;br/>
&lt;ul>
&lt;li>Total stations: 135,178&lt;/li>
&lt;li>Around 50K stations for each day&lt;/li>
&lt;li>Min stations per day: 35,460&lt;/li>
&lt;li>Max stations per day: 56,190&lt;/li>
&lt;/ul>
&lt;/div>
&lt;div class="col3">
&lt;img src="img/num_stations_by_source.jpg" width="300px">
&lt;img src="img/station_locations.jpg">
&lt;/div>
&lt;/div>
&lt;hr>
&lt;h1 id="component-in-situ-archives">Component In Situ Archives&lt;/h1>
&lt;div class="multi-column" style="font-size: 60%;">
&lt;div class="col1">
&lt;h5 id="three-sources">Three sources:&lt;/h5>
&lt;ul>
&lt;li>GPCC stations&lt;/li>
&lt;li>GHCN-Daily stations&lt;/li>
&lt;li>Collected during GEWEX workshops&lt;/li>
&lt;/ul>
&lt;/div>
&lt;div class="col1">
&lt;h5 id="merging-algorithm">Merging algorithm:&lt;/h5>
&lt;ul>
&lt;li>Lat + Lon match and&lt;/li>
&lt;li>World Met. Org. (WMO) ID match or missing&lt;/li>
&lt;/ul>
&lt;p>Or&lt;/p>
&lt;ul>
&lt;li>Coordinates within 1° of each other and&lt;/li>
&lt;li>WMO ID match or missing and&lt;/li>
&lt;li>0.99 correlation between time series with 365 days of data, of which at least 10 days have &amp;gt;1 mm precipitation&lt;/li>
&lt;/ul>
&lt;/div>
&lt;/div>
&lt;hr>
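&lt;p>The two merging rules can be sketched in Python as follows. This is a minimal illustration, not the REGEN code: the station dictionaries, the function names, the reading of "within 1°" as a per-coordinate difference, the treatment of 0.99 as a lower bound and the counting of a wet day only when both stations record more than 1 mm are all assumptions.&lt;/p>

```python
import math


def wmo_compatible(id_a, id_b):
    """WMO IDs are compatible when they match or when either one is missing (None)."""
    return id_a is None or id_b is None or id_a == id_b


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def is_duplicate(a, b, min_days=365, min_wet_days=10, min_corr=0.99):
    """Apply both merging rules to station dicts with keys lat, lon, wmo_id, series.

    `series` maps a day index to precipitation in mm; missing days are absent.
    """
    if not wmo_compatible(a["wmo_id"], b["wmo_id"]):
        return False
    # Rule 1: identical coordinates.
    if a["lat"] == b["lat"] and a["lon"] == b["lon"]:
        return True
    # Rule 2: nearby coordinates plus near-identical time series.
    if abs(a["lat"] - b["lat"]) > 1.0 or abs(a["lon"] - b["lon"]) > 1.0:
        return False
    common = sorted(set(a["series"]).intersection(b["series"]))
    xs = [a["series"][day] for day in common]
    ys = [b["series"][day] for day in common]
    wet = sum(1 for x, y in zip(xs, ys) if x > 1.0 and y > 1.0)
    if min_days > len(common) or min_wet_days > wet:
        return False
    return pearson(xs, ys) >= min_corr
```

&lt;p>The sketch ignores longitude wrap-around at the date line, which a real implementation would need to handle.&lt;/p>
&lt;hr>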
&lt;div style="font-size:70%">
&lt;h1 id="quality-control-procedures">Quality Control Procedures&lt;/h1>
&lt;ul>
&lt;li>The automated QC procedures were identical to those applied to GHCN-Daily (Durre et al. 2010)&lt;/li>
&lt;li>The procedure included two stages&lt;/li>
&lt;li>Stage 1 does temporal checks
&lt;ul>
&lt;li>multi-day accumulations&lt;/li>
&lt;li>duplicate data within timeseries&lt;/li>
&lt;li>frequent occurrence of values&lt;/li>
&lt;li>world record exceedances&lt;/li>
&lt;li>outlier checks&lt;/li>
&lt;li>temporal consistency checks&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Stage 2 does spatial checks
&lt;ul>
&lt;li>checks whether values are consistent with neighbours&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/div>
&lt;br/>
&lt;div class="citation">Durre, I., Menne, M. J., Gleason, B. E., Houston, T. G., and Vose, R. S.: Comprehensive automated quality assurance of daily surface observations, J. Appl. Meteorol. Clim., 49, 1615–1633, 2010&lt;/div>
&lt;hr>
&lt;div style="font-size: 50%;">
&lt;h1 id="interpolation-algorithm">Interpolation Algorithm&lt;/h1>
&lt;ul>
&lt;li>Ordinary Block Kriging&lt;/li>
&lt;li>Best Linear Unbiased Estimator (BLUE)&lt;/li>
&lt;li>Linear because the estimate is a weighted average of surrounding stations&lt;/li>
&lt;/ul>
&lt;p>$$\mathbf{Z}^*(s_0) = \sum_{i=1}^{N} λ_i\mathbf{Z}(s_i)$$&lt;/p>
&lt;ul>
&lt;li>Best because we use the spatial structure (covariance) to determine the value of the weights&lt;/li>
&lt;li>Unbiased because the weights are constrained to add up to 1 and so the result cannot be biased towards any one station&lt;/li>
&lt;/ul>
&lt;p>$$\sum_{i=1}^{N} λ_i = 1$$&lt;/p>
&lt;ul>
&lt;li>Ordinary Kriging assumes second-order stationarity (mean and variance constant across the domain)&lt;/li>
&lt;/ul>
&lt;p>$$\mathbf{Z}(s) = μ + ε(s)$$&lt;/p>
&lt;ul>
&lt;li>Block implies that the algorithm produces gridded area-average estimates as opposed to point estimates&lt;/li>
&lt;/ul>
&lt;/div>
&lt;hr>
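&lt;p>The constraint that the weights add up to 1 can be checked numerically. The sketch below solves a point ordinary kriging system in plain Python, assuming a hypothetical exponential covariance model and made-up station values. It is point kriging rather than the block kriging used for REGEN, and the function names are illustrative only.&lt;/p>

```python
import math


def exp_cov(h, sill=1.0, length=2.0):
    """Hypothetical exponential covariance model C(h) = sill * exp(-h / length)."""
    return sill * math.exp(-h / length)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ordinary_kriging(stations, values, target):
    """Return (estimate, weights) for one target point.

    The last row and column of the system hold the Lagrange multiplier
    that forces the weights to sum to 1.
    """
    n = len(stations)
    a = [[exp_cov(math.dist(p, q)) for q in stations] + [1.0] for p in stations]
    a.append([1.0] * n + [0.0])
    b = [exp_cov(math.dist(p, target)) for p in stations] + [1.0]
    weights = solve(a, b)[:n]
    estimate = sum(w * z for w, z in zip(weights, values))
    return estimate, weights


# Made-up station coordinates and daily rainfall totals (mm).
stations = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
rain_mm = [4.0, 6.0, 1.0]
estimate, weights = ordinary_kriging(stations, rain_mm, (0.5, 0.5))
print(round(sum(weights), 6), round(estimate, 2))
```

&lt;p>With no nugget term in the covariance model, the estimate at a station location reproduces that station's value, which is a quick sanity check on the solver.&lt;/p>
&lt;hr>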
&lt;h1 id="two-flavours-all-stations-and-long-term-stations-only">Two Flavours: All stations and Long Term Stations Only&lt;/h1>
&lt;ul>
&lt;li>The All stations based dataset interpolates all underlying stations&lt;/li>
&lt;li>The Long Term version interpolates only stations with at least 40 complete years of data&lt;/li>
&lt;li>A year is complete if all 12 months had at least 70% non-missing days&lt;/li>
&lt;/ul>
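&lt;p>The completeness rule can be written as a short check. The sketch below is only an illustration of the two thresholds stated above; the function names and the input layout (a count of non-missing days per month) are hypothetical.&lt;/p>

```python
import calendar

def year_is_complete(valid_days_by_month, year, min_fraction=0.7):
    """valid_days_by_month: 12 counts of non-missing days, January first."""
    for month, n_valid in enumerate(valid_days_by_month, start=1):
        days_in_month = calendar.monthrange(year, month)[1]
        if min_fraction * days_in_month > n_valid:
            return False  # this month falls short of 70% coverage
    return True

def is_long_term(valid_counts_by_year, min_years=40):
    """valid_counts_by_year: dict mapping a year to its 12 monthly counts."""
    n_complete = sum(
        year_is_complete(counts, year)
        for year, counts in valid_counts_by_year.items()
    )
    return n_complete >= min_years

# Hypothetical station with 25 valid days in every month of 1950 to 1989.
station = {year: [25] * 12 for year in range(1950, 1990)}
keep_station = is_long_term(station)
```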
&lt;hr>
&lt;h1 id="uncertainty-aware-guidance-for-users">Uncertainty aware guidance for users&lt;/h1>
&lt;div class="multi-column">
&lt;div class="col1" style="font-size: 90%">
&lt;br/>
&lt;ul>
&lt;li>The uncertainty information includes the Kriging Error (KE): a weighted average of the modelled variance between the interpolation location and the stations, which depends solely on the spatial distribution of stations and the grid size, and&lt;/li>
&lt;li>the Yamamoto coefficient of variation (CV) (Yamamoto 2000): a Kriging-weighted average error between the estimate and the station values&lt;/li>
&lt;li>Number of stations used for each grid estimate is also included&lt;/li>
&lt;/ul>
&lt;/div>
&lt;div class="col1">
&lt;img src="img/REGEN_AvgKriginError.jpg" width=90%>
&lt;img src="img/REGEN_CV.jpg" width=90%>
&lt;/div>
&lt;/div>
&lt;div class="citation">Yamamoto, J. K.: An Alternative Measure of the Reliability of Ordinary Kriging Estimates, Math. Geol., 32, 489–509, 2000&lt;/div>
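&lt;p>As a rough sketch of the second measure, the code below assumes the CV is the square root of a Kriging-weighted mean squared deviation of the station values from the estimate, divided by the estimate. That exact form is an assumption of this example, as are the function names; it also assumes non-negative weights that sum to 1.&lt;/p>

```python
import math

def interpolation_variance(weights, values):
    """Kriging-weighted mean squared deviation of stations from the estimate."""
    estimate = sum(w * z for w, z in zip(weights, values))
    variance = sum(w * (z - estimate) ** 2 for w, z in zip(weights, values))
    return estimate, variance

def yamamoto_cv(weights, values):
    """Interpolation standard deviation divided by the estimate (assumed form)."""
    estimate, variance = interpolation_variance(weights, values)
    if estimate == 0:
        return math.nan  # CV is undefined for a zero estimate
    return math.sqrt(variance) / estimate

# Two hypothetical stations with equal weights and values of 2 mm and 4 mm.
cv = yamamoto_cv([0.5, 0.5], [2.0, 4.0])
```

&lt;p>Unlike the KE, this measure depends on the station values themselves, so it grows where neighbouring stations disagree.&lt;/p>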
&lt;hr>
&lt;h1 id="quality-masks">Quality Masks&lt;/h1>
&lt;p style="text-align: left;">A grid cell was left unmasked if:
&lt;ul>
&lt;li>In every decade, it contained at least 1 station on at least 60% of days, and&lt;/li>
&lt;li>both the KE and CV were under the 95th percentile (spatial distribution) of the temporally averaged (over 1950 - 2016) KE and CV respectively&lt;/li>
&lt;/ul>
&lt;img src="img/REGEN_REGEN40_qualitymasks.jpg">
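&lt;p>The two criteria can be combined as in the sketch below. It is illustrative only: the nearest-rank percentile is one of several common definitions and may differ from the one used for the dataset, and the function names and flat list-of-cells layout are hypothetical.&lt;/p>

```python
import math

def percentile(values, q):
    """Nearest-rank percentile (one of several common definitions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]

def quality_mask(mean_ke, mean_cv, coverage_ok, q=95):
    """Return True for every grid cell that stays unmasked.

    mean_ke, mean_cv: temporally averaged KE and CV, one value per grid cell.
    coverage_ok: per-cell flag for the station-coverage criterion.
    """
    ke_limit = percentile(mean_ke, q)
    cv_limit = percentile(mean_cv, q)
    return [
        ok and ke_limit > ke and cv_limit > cv
        for ke, cv, ok in zip(mean_ke, mean_cv, coverage_ok)
    ]

# Twenty hypothetical grid cells; the first one fails the coverage criterion.
ke = [float(i) for i in range(1, 21)]
cv = [float(i) for i in range(1, 21)]
coverage = [False] + [True] * 19
mask = quality_mask(ke, cv, coverage)
```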
&lt;hr>
&lt;h1 id="comparison-with-other-global-gridded-datasets-of-monthly-precipitation">Comparison with other global gridded datasets of monthly precipitation&lt;/h1>
&lt;img src="img/AnnualPrcpTotAnomalyTS_REGEN_GPCC_GHCN_CRU.png" width=80%>
&lt;hr>
&lt;h1 id="comparison-with-other-global-gridded-datasets-of-daily-precipitation">Comparison with other global gridded datasets of daily precipitation&lt;/h1>
&lt;img src="img/REGENvsGPCCFDDandCPC.jpg" width=75%>
&lt;hr>
&lt;div class="top">
&lt;h1 id="comparison-with-regional-daily-precipitation-datasets">Comparison with regional daily precipitation datasets&lt;/h1>
&lt;div class="multi-column">
&lt;div class="col1">
&lt;p>Mean difference&lt;/p>
&lt;/div>
&lt;div class="col1">
&lt;p>SD of difference&lt;/p>
&lt;/div>
&lt;div class="col1">
&lt;p>Temporal correlation&lt;/p>
&lt;/div>
&lt;/div>
&lt;div id="imgfrag">
&lt;span class="fragment " >
&lt;img src="img/REGENvsCPCCONUS.jpg">
&lt;/span>
&lt;span class="fragment " >
&lt;img src="img/REGENvsEOBS.jpg">
&lt;/span>
&lt;span class="fragment " >
&lt;img src="img/REGENvsAWAP.jpg">
&lt;/span>
&lt;span class="fragment " >
&lt;img src="img/REGENvsAPHRODITE.jpg">
&lt;/span>
&lt;span class="fragment " >
&lt;img src="img/REGENvsSAOBS.jpg">
&lt;/span>
&lt;/div>
&lt;/div>
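&lt;p>For one grid cell, the three statistics named in the column headings can be computed as sketched below. The function name and the sample values are hypothetical, and both series are assumed to share the same grid and time steps.&lt;/p>

```python
import math
import statistics

def compare_series(a, b):
    """Return (mean difference, SD of difference, temporal correlation)."""
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    corr = cov / math.sqrt(var_a * var_b)  # Pearson correlation
    return mean_diff, sd_diff, corr

# Hypothetical daily series (mm) from two datasets at the same grid cell.
mean_diff, sd_diff, corr = compare_series([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
```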
&lt;hr>
&lt;h1 id="application-global-changes-in-precipitation">Application: Global changes in precipitation&lt;/h1>
&lt;div class="multi-column">
&lt;div class="col1">
&lt;p>Trends in annual precipitation (1950 - 2016) (mm/yr)&lt;/p>
&lt;img src="img/trends.jpg" width=100%>
&lt;div class="citation">Contractor, S., Donat, M. G., &amp; Alexander, L. V. (2021). Changes in Observed Daily Precipitation over Global Land Areas since 1950. Journal of Climate, 34(1), 3–19.&lt;/div>
&lt;/div>
&lt;div class="col1">
&lt;p style="font-size: 50%">Wet-day frequency changes between 1950-1983 and 1984-2016 (%)
&lt;img src="img/frequency_changes.jpg" width=80%>
&lt;p style="font-size: 50%">Mean precipitation intensity changes between 1950-1983 and 1984-2016 (%)
&lt;img src="img/intensity_changes.jpg" width=80%>
&lt;/div>
&lt;/div>
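&lt;p>The percentage changes in the two right-hand maps can be illustrated for a single grid cell as follows. The 1 mm wet-day threshold is an assumption of this sketch (a common convention), and the daily values are made up.&lt;/p>

```python
def wet_day_stats(daily_mm, wet_threshold=1.0):
    """Return (wet-day frequency, mean wet-day intensity) for one period."""
    wet = [p for p in daily_mm if p >= wet_threshold]
    frequency = len(wet) / len(daily_mm)
    intensity = sum(wet) / len(wet) if wet else 0.0
    return frequency, intensity

def percent_change(early, late):
    """Relative change from the early to the late period, in percent."""
    return 100.0 * (late - early) / early

early = [0.0, 2.0, 0.0, 4.0]  # hypothetical daily totals (mm), 1950-1983
late = [3.0, 0.0, 6.0, 3.0]   # hypothetical daily totals (mm), 1984-2016
f_early, i_early = wet_day_stats(early)
f_late, i_late = wet_day_stats(late)
freq_change = percent_change(f_early, f_late)
int_change = percent_change(i_early, i_late)
```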
&lt;hr>
&lt;h1 id="changes-across-the-precipitation-distribution-between-1950-1983-and-1984-2016">Changes across the precipitation distribution between 1950-1983 and 1984-2016&lt;/h1>
&lt;div class="multi-column">
&lt;div class="col1">
&lt;ul>
&lt;li>Spatially, changes in precipitation seem complex, even stochastic at first&lt;/li>
&lt;li>But a clear signal of positive precipitation changes, consistent with thermodynamic expectations, is apparent in the high quantiles&lt;/li>
&lt;li>This signal disappears again for the most extreme precipitation&lt;/li>
&lt;/ul>
&lt;/div>
&lt;div class="col1">
&lt;p style="font-size: 50%">Relative difference in area showing positive changes vs area showing negative changes
&lt;img src="img/distribution_changes.jpg">
&lt;/div>
&lt;/div>
&lt;hr>
&lt;h1 id="synchronous-changes-between-frequency-and-intensity">Synchronous changes between frequency and intensity&lt;/h1>
&lt;div class="multi-column">
&lt;div class="col1">
&lt;ul>
&lt;li>Mean changes in frequency and intensity are aligned in only around one third of grid cells&lt;/li>
&lt;li>Extreme changes in frequency and intensity are aligned in almost 80% of areas globally&lt;/li>
&lt;/ul>
&lt;/div>
&lt;div class="col1">
&lt;img src="img/freqchanges_vs_intchanges.jpg">
&lt;/div>
&lt;/div>
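&lt;p>One simple way to quantify this alignment is the fraction of grid cells where the two changes share a sign. The sketch below takes that reading as an assumption; it ignores area weighting, and the function name and values are hypothetical.&lt;/p>

```python
def sign_agreement(freq_changes, int_changes):
    """Fraction of grid cells where both changes have the same sign."""
    same = sum(1 for f, i in zip(freq_changes, int_changes) if f * i > 0)
    return same / len(freq_changes)

# Four hypothetical grid cells: the first and last agree in sign.
aligned = sign_agreement([1.0, -1.0, 2.0, -2.0], [1.0, 1.0, -1.0, -3.0])
```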
&lt;hr>
&lt;h1 id="my-ideal-future-of-climate-datasets">(My ideal) Future of climate datasets&lt;/h1>
&lt;div style="font-size:80%">
&lt;ul>
&lt;li>All &amp;ldquo;observational&amp;rdquo; datasets are estimates from a statistical model and carry both aleatoric and epistemic uncertainties&lt;/li>
&lt;li>If we stop thinking of observations as immutable facts and instead think of them as data-generating models, then we can ask more meaningful questions&lt;/li>
&lt;li>E.g. for validation studies, instead of doing a grid cell by grid cell comparison, we can calculate the conditional probability of the model output given the observations&lt;/li>
&lt;li>To do this we need observations to be inherently probabilistic (the entire distribution), e.g. Risser et al. 2019&lt;/li>
&lt;li>Artificial intelligence assisted inference can alleviate computational bottlenecks that traditionally made inference algorithms impractical in climate sciences, e.g. Zammit-Mangion et al. 2021 and Lenzi et al. 2023&lt;/li>
&lt;li>As the examples demonstrate, even a dataset of extremes is possible with this approach&lt;/li>
&lt;/ul>
&lt;/div>
&lt;br/>
&lt;div class="citation">Risser, M. D., Paciorek, C. J., Wehner, M. F., O’Brien, T. A., &amp; Collins, W. D. (2019). A probabilistic gridded product for daily precipitation extremes over the United States. Climate Dynamics, 53(5), 2517–2538.&lt;/div>
&lt;div class="citation">Zammit-Mangion, A., Ng, T. L. J., Vu, Q., &amp; Filippone, M. (2021). Deep Compositional Spatial Models. Journal of the American Statistical Association, 0(0), 1–47.&lt;/div>
&lt;div class="citation">Lenzi, A., Bessac, J., Rudi, J., &amp; Stein, M. L. (2023). Neural networks for parameter estimation in intractable models. Computational Statistics &amp; Data Analysis, 185, 107762.&lt;/div>
&lt;hr>
&lt;h1 id="thank-you">Thank you&lt;/h1>
&lt;img style="border-radius:50%" src="https://s.gravatar.com/avatar/f5f75fa059f22f822b95edc56c68930873203f6816d9d0864a75d82b2453e52f?s=250">
&lt;br/>
&lt;a href="mailto:s.contractor@unsw.edu.au" aria-label="envelope">
&lt;i class="fas fa-envelope big-icon">&lt;/i>
&lt;/a>
&lt;a href="https://twitter.com/stefancontracto" target="_blank" rel="noopener" aria-label="twitter">
&lt;i class="fab fa-twitter big-icon">&lt;/i>
&lt;/a>
&lt;a href="https://scholar.google.co.uk/citations?user=sEnHZ3AAAAAJ" target="_blank" rel="noopener" aria-label="google-scholar">
&lt;i class="ai ai-google-scholar big-icon">&lt;/i>
&lt;/a>
&lt;a href="https://github.com/steefancontractor" target="_blank" rel="noopener" aria-label="github">
&lt;i class="fab fa-github big-icon">&lt;/i>
&lt;/a>
&lt;a href="https://www.linkedin.com/in/steefan-contractor-b375bb209/" target="_blank" rel="noopener" aria-label="linkedin">
&lt;i class="fab fa-linkedin big-icon">&lt;/i>
&lt;/a>
&lt;a href="https://mastodon.au/@stefancontracto" target="_blank" rel="noopener" aria-label="mastodon">
&lt;i class="fab fa-mastodon big-icon">&lt;/i>
&lt;/a></description></item></channel></rss>