For my distributed scalable computing course, my group of two worked on a project that combined the theory behind genetic algorithms with the CUDA framework. We built an application that searches for the optimal launch parameters needed to hit a target location, accounting for basic physics such as gravity and wind resistance.
In summary, a genetic algorithm aims to minimize a given cost function over a set of parameters ("genes") that are encapsulated in what are referred to as "individuals". For our problem, each individual holds values such as a launch angle and an initial velocity, and after evaluation it has a cost associated with it. From a large group of individuals that makes up a population, the best (lowest-cost) individuals are used to create new individuals to be evaluated, while the worst-performing individuals (highest cost) are removed from the pool. Over many iterations (referred to as "generations"), the surviving individuals should yield better and better parameters with smaller and smaller costs. For our problem, that cost is the distance from the target.
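To make that structure concrete, here is a minimal host-side sketch of one generation: evaluate every individual, keep the lowest-cost half, and refill the population by crossover with a small mutation. The names (`Individual`, `evaluate_cost`, `step_generation`), the linear-drag model, and all constants are illustrative assumptions, not our actual implementation.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <vector>

// One individual: its "genes" (launch angle, initial speed) plus its evaluated cost.
struct Individual {
    float angle;     // launch angle in radians
    float velocity;  // initial speed in m/s
    float cost;      // distance from the target after simulation
};

// Simulate a 2D trajectory with gravity and linear drag (a stand-in for wind
// resistance), then return how far the landing point is from the target.
float evaluate_cost(const Individual& ind, float target_x) {
    const float g = 9.81f, drag = 0.05f, dt = 0.01f;
    float x = 0.0f, y = 0.0f;
    float vx = ind.velocity * std::cos(ind.angle);
    float vy = ind.velocity * std::sin(ind.angle);
    do {                                   // integrate until the projectile lands
        vx -= drag * vx * dt;
        vy -= (g + drag * vy) * dt;
        x += vx * dt;
        y += vy * dt;
    } while (y > 0.0f);
    return std::fabs(x - target_x);        // cost = distance from the target
}

// One generation: evaluate everyone, keep the best half, refill by crossover + mutation.
void step_generation(std::vector<Individual>& pop, float target_x) {
    for (auto& ind : pop) ind.cost = evaluate_cost(ind, target_x);
    std::sort(pop.begin(), pop.end(),
              [](const Individual& a, const Individual& b) { return a.cost < b.cost; });
    const size_t half = pop.size() / 2;
    for (size_t i = half; i < pop.size(); ++i) {
        const Individual& p1 = pop[std::rand() % half];
        const Individual& p2 = pop[std::rand() % half];
        float m = (std::rand() / (float)RAND_MAX - 0.5f) * 0.1f;   // small mutation
        pop[i].angle    = 0.5f * (p1.angle + p2.angle) + m;        // averaging crossover
        pop[i].velocity = 0.5f * (p1.velocity + p2.velocity) + m * 10.0f;
    }
}

int main() {
    std::srand(42);
    const float target_x = 500.0f;
    std::vector<Individual> pop(1024);
    for (auto& ind : pop) {                                        // random initial genes
        ind.angle    = (std::rand() / (float)RAND_MAX) * 1.5f;     // ~0 to 86 degrees
        ind.velocity = (std::rand() / (float)RAND_MAX) * 100.0f;   // 0 to 100 m/s
    }
    for (int gen = 0; gen < 100; ++gen) step_generation(pop, target_x);
    std::printf("best distance from target: %.2f m\n", pop[0].cost);
    return 0;
}
```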
The parallel aspect of our project was using CUDA to reduce the time needed to find a solution: first by having each individual run on its own thread to simulate the projectile's final position and compute its cost, and then, once every individual's simulation had run, by crossing over the best individuals in parallel to further improve the efficiency of the program. The parallel crossover was the most experimental part of the project, as it involved devising new ways to structure the genetic algorithm that were not attempted in my 2020 Summer fellowship (where the genetic algorithm was done in serial).
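Below is a minimal sketch of what a one-thread-per-individual evaluation kernel might look like. The kernel name `simulate_costs`, the device arrays, the drag model, and the launch configuration are illustrative assumptions rather than our actual code, and the parallel crossover step is omitted.

```cuda
#include <cuda_runtime.h>
#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <vector>

// One thread per individual: simulate its trajectory and write its cost.
__global__ void simulate_costs(const float* angle, const float* velocity,
                               float* cost, int n, float target_x) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;                          // guard the last partial block

    const float g = 9.81f, drag = 0.05f, dt = 0.01f;
    float x = 0.0f, y = 0.0f;
    float vx = velocity[i] * cosf(angle[i]);
    float vy = velocity[i] * sinf(angle[i]);
    do {                                         // integrate until the projectile lands
        vx -= drag * vx * dt;
        vy -= (g + drag * vy) * dt;
        x += vx * dt;
        y += vy * dt;
    } while (y > 0.0f);
    cost[i] = fabsf(x - target_x);               // cost = distance from the target
}

int main() {
    const int n = 1 << 14;                       // population size
    const float target_x = 500.0f;

    // Random initial genes on the host (cuRAND could generate these on-device instead).
    std::vector<float> h_angle(n), h_velocity(n), h_cost(n);
    for (int i = 0; i < n; ++i) {
        h_angle[i]    = (std::rand() / (float)RAND_MAX) * 1.5f;
        h_velocity[i] = (std::rand() / (float)RAND_MAX) * 100.0f;
    }

    float *d_angle, *d_velocity, *d_cost;
    cudaMalloc(&d_angle, n * sizeof(float));
    cudaMalloc(&d_velocity, n * sizeof(float));
    cudaMalloc(&d_cost, n * sizeof(float));
    cudaMemcpy(d_angle, h_angle.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(d_velocity, h_velocity.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    const int block = 256;
    const int grid = (n + block - 1) / block;
    simulate_costs<<<grid, block>>>(d_angle, d_velocity, d_cost, n, target_x);
    cudaMemcpy(h_cost.data(), d_cost, n * sizeof(float), cudaMemcpyDeviceToHost);

    std::printf("best distance in this generation: %.2f m\n",
                *std::min_element(h_cost.begin(), h_cost.end()));

    cudaFree(d_angle); cudaFree(d_velocity); cudaFree(d_cost);
    return 0;
}
```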
While the final result we presented sadly did not achieve a strong improvement in finding a solution, it did demonstrate a more efficient way of running the algorithm than the serial version and of applying it to a physics-based problem.