
Oral Candidacy - Arturo Argueta

Start: 2/7/2018 at 10:30AM
End: 2/7/2018 at 2:00PM
Location: 257G Fitzpatrick

Arturo Argueta

Oral Candidacy

February 7, 2018        10:30 am        257G Fitzpatrick

Adviser:

Dr. David Chiang

Committee Members:

Dr. Adam Lopez        Dr. Michael Niemier        Dr. Tim Weninger

Title:

"Accelerating Natural Language Processing algorithms using Graphics Processing Units"

Abstract:

Natural Language Processing (NLP) studies the techniques computers use to understand and interpret human languages. NLP covers a wide range of subtopics such as syntax (analyzing whether the words in an utterance are well arranged), semantics (understanding the meaning of combined words), and discourse. Most state-of-the-art NLP systems train their models on large amounts of natural language text. Even with a large quantity of training text, words that do not appear during training are assigned a probability of zero, so the resulting data structures contain many zeros. Computing on such data structures is costly, since a large share of the computing time is spent on wasteful operations on zero elements. Ideally, the computation time should be spent on non-zero elements only, and the time spent on elements with zero probability should be minimized.

Graphics Processing Units (GPUs) are widely used to perform a large number of operations in a short amount of time. A drawback of these accelerators is that not all problems can be parallelized, and not every parallel implementation runs faster than a serial CPU one. Processing large data structures with many zero elements on GPUs is one such case: if a parallel implementation does not take advantage of the sparsity of its input, a large part of the computation time is spent on ineffective operations on zero elements. Previous work uses the full sparse structures for computation and relies on generic off-the-shelf APIs to process the input efficiently. Speedups can be achieved if the parallel implementation is tailored to the sparsity pattern of the problem being solved. This presentation gives an overview of sparse problems in NLP, of computational methods used in NLP, and of concepts from High Performance Computing (HPC) that can be adapted to run efficiently in NLP systems.
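
As a rough illustration of the cost of zeros described in the abstract, the sketch below compares a dense matrix-vector product with one over a compressed sparse row (CSR) layout and counts the multiplications each performs. It is a serial, CPU-only toy and not the method presented in the talk; the function names, the 3x3 table `probs`, and the vector `state` are all hypothetical values chosen for illustration.

```python
def dense_matvec(matrix, vector):
    """Multiply a dense matrix (list of rows) by a vector.

    Returns the result and the number of multiplications performed,
    including the wasted ones where the matrix entry is zero.
    """
    result, mults = [], 0
    for row in matrix:
        total = 0.0
        for value, x in zip(row, vector):
            total += value * x
            mults += 1
        result.append(total)
    return result, mults


def to_csr(matrix):
    """Convert a dense matrix to CSR: values, column indices, row pointers."""
    values, col_idx, row_ptr = [], [], [0]
    for row in matrix:
        for j, value in enumerate(row):
            if value != 0.0:
                values.append(value)
                col_idx.append(j)
        # row_ptr[i + 1] marks where row i ends in `values`.
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, vector):
    """Multiply a CSR matrix by a vector, touching only stored non-zeros."""
    result, mults = [], 0
    for i in range(len(row_ptr) - 1):
        total = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            total += values[k] * vector[col_idx[k]]
            mults += 1
        result.append(total)
    return result, mults


# Hypothetical probability table with many zero entries (illustration only).
probs = [
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
state = [0.2, 0.3, 0.5]

dense_result, dense_mults = dense_matvec(probs, state)
sparse_result, sparse_mults = csr_matvec(*to_csr(probs), state)

print(dense_mults, sparse_mults)  # 9 multiplications vs. 4
```

Both versions produce the same result, but the CSR version performs one multiplication per stored non-zero. On a GPU, how the non-zeros are distributed across rows also affects how evenly work can be split among threads, which is one reason a layout tailored to the sparsity pattern can matter.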