According to MLU Explain: "Gradient descent is an iterative optimization algorithm that estimates some set of coefficients to yield the minimum of a convex function". Or, in layman's terms, it finds suitable numbers such that our prediction error is minimized.
How does it work?
Assume we have a convex function that represents the error of a machine learning model (e.g. MSE; check Regression if you don't know what I'm talking about). Gradient descent iteratively updates the model's coefficients (which are really just numbers), moving them toward the error function's minimum. Basically: this approach identifies the coefficients needed to fit the data.
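The idea can be sketched in a few lines. Below is a minimal, hypothetical example: fitting a line to made-up noisy data by repeatedly nudging the two coefficients (slope and intercept) in the direction that decreases the MSE. The data, learning rate, and iteration count are all illustrative choices, not part of any standard recipe.

```python
import numpy as np

# Made-up data: a noisy linear relationship, y ≈ 2x + 1
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 50)
y = 2 * x + 1 + rng.normal(0, 0.5, 50)

# Coefficients to learn (slope w, intercept b), starting at zero
w, b = 0.0, 0.0
lr = 0.01  # learning rate: how big a step we take toward the minimum

for _ in range(2000):
    y_pred = w * x + b
    error = y_pred - y
    # Gradients of the MSE with respect to w and b
    grad_w = 2 * np.mean(error * x)
    grad_b = 2 * np.mean(error)
    # Move the coefficients downhill, against the gradient
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # should end up close to the true values 2 and 1
```

Each iteration asks "which way does the error decrease?" (the negative gradient) and takes a small step that way; repeat enough times and the coefficients settle near the minimum of the convex error surface.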