Project 2 CompSci program, deadline May 31

CompSci program
Department of Physics, University of Oslo, Norway

Feb 21, 2023


Paths for project 2

Possible paths for project 2

We discuss here several paths as well as data sets for the second project.

  1. The computational path: Here we propose a path where you develop your own code for a neural network (or CNNs or RNNs) and apply it to data of your own selection. The code should be object oriented and flexible, allowing for eventual extensions such as different loss/cost functions and other functionalities; a minimal sketch of such a design is given right after this list. Feel free to select data sets from those suggested below. This code can also be extended by adding, for example, autoencoders. You can compare your own code with implementations using TensorFlow (Keras), PyTorch or other libraries.
  2. The differential equation path: Here we propose a set of differential equations (ordinary and/or partial) to be solved first using neural networks (with either your own code or TensorFlow/PyTorch or similar libraries); a small sketch of this approach also follows the list below. Thereafter we plan to extend the set of methods for solving these equations to recurrent neural networks and autoencoders. All these approaches can be expanded into one large project. This project can also be extended to include physics-informed machine learning, that is, neural networks trained to solve supervised learning tasks while respecting any given law of physics described by general nonlinear partial differential equations.
  3. The application path: Here you can use the most relevant method(s) (say neural networks, or convolutional neural networks for images) and apply them to data sets relevant for your own research.
  4. Finally, we also propose a partial differential equation path.
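
For the computational path, the following is a minimal sketch of the kind of object-oriented, flexible design suggested above: the cost function is passed in as a separate object, so that alternatives (cross-entropy, Ridge-type penalties, etc.) can be added later. The class and function names are hypothetical, and the network is deliberately small (one hidden layer, plain gradient descent); it is a starting point under these assumptions, not a finished implementation.

    import numpy as np

    class MSECost:
        """Mean squared error and its gradient with respect to the prediction."""
        def value(self, y, y_pred):
            return np.mean((y - y_pred) ** 2)
        def gradient(self, y, y_pred):
            return 2.0 * (y_pred - y) / y.shape[0]

    class SimpleNeuralNet:
        """One-hidden-layer feed-forward network trained with plain gradient descent."""
        def __init__(self, n_inputs, n_hidden, cost=None, lr=0.1, seed=0):
            rng = np.random.default_rng(seed)
            self.W1 = rng.normal(0.0, 0.5, (n_inputs, n_hidden))
            self.b1 = np.zeros(n_hidden)
            self.W2 = rng.normal(0.0, 0.5, (n_hidden, 1))
            self.b2 = np.zeros(1)
            self.cost = cost if cost is not None else MSECost()  # pluggable cost
            self.lr = lr

        def forward(self, X):
            self.a1 = np.tanh(X @ self.W1 + self.b1)    # hidden layer activation
            return self.a1 @ self.W2 + self.b2          # linear output layer

        def train(self, X, y, n_epochs=2000):
            for _ in range(n_epochs):
                y_pred = self.forward(X)
                delta2 = self.cost.gradient(y, y_pred)              # output error
                delta1 = (delta2 @ self.W2.T) * (1.0 - self.a1**2)  # backprop through tanh
                self.W2 -= self.lr * self.a1.T @ delta2
                self.b2 -= self.lr * delta2.sum(axis=0)
                self.W1 -= self.lr * X.T @ delta1
                self.b1 -= self.lr * delta1.sum(axis=0)

    # Tiny usage example on synthetic data (fitting y = x^2)
    X = np.linspace(-1, 1, 50).reshape(-1, 1)
    y = X**2
    net = SimpleNeuralNet(n_inputs=1, n_hidden=10)
    net.train(X, y)
    print("final MSE:", MSECost().value(y, net.forward(X)))

The same structure carries over to your own project by swapping in other cost objects, activation functions or optimizers, and by comparing the results with a TensorFlow (Keras) or PyTorch implementation.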
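
For the differential equation path, the sketch below (using PyTorch, one of the libraries mentioned above) solves the simple test problem dy/dx = -y with y(0) = 1 via the trial solution g(x) = 1 + x N(x), which fulfills the initial condition by construction; the loss is the squared residual of the equation. The exact solution exp(-x) is used only to check the result, and the network size and training settings are arbitrary illustrative choices.

    import torch

    torch.manual_seed(0)
    model = torch.nn.Sequential(
        torch.nn.Linear(1, 20), torch.nn.Tanh(), torch.nn.Linear(20, 1)
    )
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
    x = torch.linspace(0.0, 2.0, 100).reshape(-1, 1).requires_grad_(True)

    for epoch in range(2000):
        optimizer.zero_grad()
        g = 1.0 + x * model(x)                       # trial solution, g(0) = 1
        dg_dx, = torch.autograd.grad(g, x, grad_outputs=torch.ones_like(g),
                                     create_graph=True)
        loss = torch.mean((dg_dx + g) ** 2)          # residual of dy/dx = -y
        loss.backward()
        optimizer.step()

    # Compare with the exact solution exp(-x)
    error = torch.max(torch.abs(1.0 + x * model(x) - torch.exp(-x)))
    print("max abs error vs exp(-x):", error.item())

The same idea, with the residual of a partial differential equation in the loss, is the starting point for the physics-informed extensions mentioned above.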

You can propose your own data sets that relate to your research interests, or use existing data sets from, say,

  1. Kaggle
  2. The University of California at Irvine (UCI) with its machine learning repository.
  3. For the differential equation problems, you can generate your own datasets, as described below; a short illustrative sketch also follows this list.
  4. If possible, you should link the data sets with existing research and analyses thereof. Scientific articles which have used Machine Learning algorithms to analyze the data are highly welcome. Perhaps you can improve previous analyses and even publish a new article?
  5. A critical assessment of the methods, with corresponding perspectives and recommendations, is also something you need to include.
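
As an illustration of point 3, one possible way to generate your own data for a differential equation problem is to sample the analytical solution of a simple case. The example below (an assumption about what such a data set can look like, with arbitrary sample sizes and file name) uses the one-dimensional diffusion equation u_t = u_xx on [0,1] with u(x,0) = sin(pi x) and zero boundary values, whose exact solution is u(x,t) = exp(-pi^2 t) sin(pi x).

    import numpy as np

    rng = np.random.default_rng(2023)
    n_points = 2000
    x = rng.uniform(0.0, 1.0, n_points)
    t = rng.uniform(0.0, 0.5, n_points)
    u = np.exp(-np.pi**2 * t) * np.sin(np.pi * x)   # exact solution used as target

    # Store (x, t, u) triples as a training set; the file name is arbitrary.
    data = np.column_stack([x, t, u])
    np.save("diffusion_training_data.npy", data)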

The approach to the analysis of these new data sets should follow to a large extent what you did in project 1. That is:

  1. Whether you end up with a regression or a classification problem, you should employ at least two of the methods we have discussed: linear regression (including Ridge and Lasso), Logistic Regression, Neural Networks, Convolutional Neural Networks, Recurrent Neural Networks, Adversarial Neural Networks, Support Vector Machines, Decision Trees, Random Forests, Bagging and Boosting.
  2. The evaluation measures you used and tested in project 1 should also be included, that is the \( R^2 \)-score, MSE, confusion matrix, accuracy score, information gain, ROC and cumulative gains curves and others, as well as cross-validation and/or bootstrap if these are relevant; see the first sketch after this list.
  3. Similarly, feel free to explore various activation functions in deep learning and various approaches to stochastic gradient descent; the second sketch after this list illustrates such a comparison.
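
As an illustration of point 2, the sketch below computes a few of these measures with scikit-learn on a synthetic classification problem; your own data set and models replace the assumed ones. For a regression problem the corresponding functions are, for example, r2_score and mean_squared_error from sklearn.metrics.

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
    from sklearn.model_selection import cross_val_score, train_test_split

    # Synthetic stand-in for your own data set
    X, y = make_classification(n_samples=500, n_features=10, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    y_pred = clf.predict(X_test)

    print("accuracy:", accuracy_score(y_test, y_pred))
    print("confusion matrix:\n", confusion_matrix(y_test, y_pred))
    print("ROC AUC:", roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))
    print("5-fold CV accuracy:", cross_val_score(clf, X, y, cv=5))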
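
And as an illustration of point 3, the following sketch loops over a few activation functions and stochastic gradient descent variants with TensorFlow/Keras on synthetic regression data; the particular grid of activations and optimizers is only an example, as are the network size and number of epochs.

    import numpy as np
    import tensorflow as tf

    rng = np.random.default_rng(0)
    X = rng.uniform(-1.0, 1.0, (200, 1))
    y = X**2 + 0.05 * rng.normal(size=(200, 1))

    for activation in ["sigmoid", "relu", "tanh"]:
        for optimizer in ["sgd", "rmsprop", "adam"]:
            model = tf.keras.Sequential([
                tf.keras.Input(shape=(1,)),
                tf.keras.layers.Dense(20, activation=activation),
                tf.keras.layers.Dense(1),
            ])
            model.compile(optimizer=optimizer, loss="mse")
            history = model.fit(X, y, epochs=50, verbose=0)
            print(activation, optimizer, history.history["loss"][-1])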

All in all, the report should follow the same pattern as project 1, with abstract, introduction, methods, code, results, conclusions etc.