Interesting article from Damien Benveniste on the journey of AI / ChatGPT
Warren Weaver was the first to suggest an algorithmic approach to machine translation (MT) in 1949, and this led to the Georgetown experiment, the first computer application of MT, in 1954. In 1957, Chomsky published Syntactic Structures, introducing his theory of generative grammar. ELIZA (1964) and SHRDLU (1968) can be considered the first natural-language understanding computer programs.
The 60s and early 70s marked the era of grammar theories. During the 70s, the concept of conceptual ontologies became quite fashionable. Conceptual ontologies are similar to knowledge graphs, in which concepts are linked to one another according to how they are associated. Famous examples include MARGIE (1975), TaleSpin (1976), QUALM (1977), SAM (1978), PAM (1978), Politics (1979), and Plot Units (1981).
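To make the idea concrete, here is a tiny illustrative sketch in Python of how such an ontology can be represented as concepts linked by their associations. The concepts and relations below are invented for illustration and are not taken from MARGIE or any of the systems above.

```python
# Toy conceptual ontology: each entry links two concepts by an association
# (hypothetical example, purely for illustration).
ontology = [
    ("restaurant", "serves", "food"),
    ("customer", "orders", "food"),
    ("customer", "pays", "bill"),
    ("waiter", "brings", "food"),
]

def related_to(concept):
    """Return every (relation, other_concept) pair associated with a concept."""
    links = []
    for subject, relation, obj in ontology:
        if subject == concept:
            links.append((relation, obj))
        elif obj == concept:
            links.append((relation, subject))
    return links

# e.g. related_to("food") -> [('serves', 'restaurant'), ('orders', 'customer'), ('brings', 'waiter')]
print(related_to("food"))
```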
The 80s were a period of great success for symbolic methods. In 1983, Charniak proposed Passing Markers, a mechanism for resolving ambiguities in language comprehension by indicating the relationships between adjacent words. In 1986, Riesbeck and Martin proposed Uniform Parsing, a new approach to natural language processing that combines parsing and inferencing in a uniform framework for language learning. In 1987, Hirst proposed a new approach to resolving ambiguity: Semantic Interpretation.
The 90s saw the advent of statistical models. It was the beginning of thinking about language as a probabilistic process. In 1989, Bahl proposed a tree-based method to predict the next word in a sentence. IBM presented a series of models for statistical machine translation. In 1990, Chitrao and Grishman demonstrated the potential of statistical parsing techniques for processing messages, and Brill et al. introduced a method for automatically inducing a part-of-speech tagger by training on a large corpus of text. In 1991, Brown proposed a method for aligning sentences in parallel corpora for machine translation applications.
In 2003, Bengio proposed the first neural language model, a simple feed-forward network. In 2008, Collobert and Weston applied multi-task learning with convolutional neural networks. In 2011, Hinton built a generative text model with recurrent neural networks. In 2013, Mikolov introduced Word2Vec. In 2014, Sutskever proposed a model for sequence-to-sequence learning. In 2017, Vaswani gave us the Transformer architecture, which led to a revolution in model performance. In 2018, Devlin presented BERT, which popularized Transformers. And in 2022, we finally got to experience ChatGPT, which completely changed the way the public perceives AI.
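For readers wondering what a "simple feed-forward neural language model" looks like, here is a minimal sketch in PyTorch. The class name, dimensions, and context size are illustrative assumptions, not Bengio et al.'s exact setup: the idea is simply to embed the previous few words, concatenate the embeddings, pass them through a hidden layer, and score every word in the vocabulary as the candidate next word.

```python
import torch
import torch.nn as nn

class FeedForwardLM(nn.Module):
    """Minimal feed-forward language model sketch (illustrative only)."""

    def __init__(self, vocab_size=10000, context_size=3, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)            # word id -> vector
        self.hidden = nn.Linear(context_size * embed_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)                 # scores over the vocabulary

    def forward(self, context_ids):
        # context_ids: (batch, context_size) ids of the previous words
        e = self.embed(context_ids)                 # (batch, context_size, embed_dim)
        h = torch.tanh(self.hidden(e.flatten(1)))   # concatenate context embeddings
        return self.out(h)                          # logits for the next word

# Usage: predict the next word given three (arbitrary) preceding word ids.
model = FeedForwardLM()
logits = model(torch.tensor([[12, 7, 431]]))
next_word_id = logits.argmax(dim=-1)
```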