String Algorithms in C: Efficient Text Representation and Search
- 3h 36m
- Thomas Mailund
- Apress
- 2020
Implement practical data structures and algorithms for text search and discover how they are used inside larger applications. This unique in-depth guide explains string algorithms using the C programming language. String Algorithms in C teaches you the following algorithms and how to use them: classical exact search algorithms; tries and compact tries; suffix trees and arrays; approximate pattern searches; and more.
In this book, author Thomas Mailund provides a library with all the algorithms and applicable source code that you can use in your own programs. Every algorithm presented in the book is implemented, so there are plenty of examples.
You’ll see that string algorithms are used in a wide range of applications, including image processing, computer vision, text analytics in fields from data science to web applications, information retrieval from databases, and network security.
What You Will Learn
- Use classical exact search algorithms including naive search, borders/border search, Knuth-Morris-Pratt, and Boyer-Moore with or without Horspool
- Search in trees, use tries and compact tries, and work with the Aho-Corasick algorithm
- Process suffix trees including the use and development of McCreight’s algorithm
- Work with suffix arrays including binary searches; sorting naive constructions; suffix tree construction; skew algorithms; and the Burrows-Wheeler transform (BWT)
- Deal with enhanced suffix arrays including longest common prefix (LCP)
- Carry out approximate pattern searches in suffix trees and approximate BWT searches
Who This Book Is For
Those with at least some prior programming experience in C or Assembly and some prior experience with programming algorithms.
About the Author
Thomas Mailund is an associate professor in bioinformatics at Aarhus University, Denmark. He has a background in math and computer science, including experience programming and teaching in the C and R programming languages. For the last decade, his main focus has been on genetics and evolutionary studies, particularly comparative genomics, speciation, and gene flow between emerging species.
In this Book
- Introduction
- Classical Algorithms for Exact Search
- Suffix Trees
- Suffix Arrays
- Approximate Search
- Conclusions