APPLICATIONS OF TECHNOLOGY:
ADVANTAGES:
ABSTRACT: Berkeley Lab scientists Sung-Hou Kim and Gregory E. Sims have developed a computational method that compares, categorizes and indexes information-bearing objects according to their content. Objects with linear or linearizable information, including books, genetic codes and digitized audio or video recordings, are compared based on the frequency of certain predefined features, such as a string of letters or numbers. The Berkeley Lab technology provides eigenvalues and eigenvectors, which not only convey the degree of difference or similarity between objects' content but also identify the characteristics that make certain objects different or similar. In addition, the eigenvalues provide an objective and simple way of indexing. The relationships can then be depicted with a multidimensional matrix or diagrammatic tree.

As a result, the new method surpasses techniques that group objects by analyzing subjectively chosen characteristics or segments of data. For example, indices and website search engines rely on the presence or absence of specific keywords, on traffic and on the frequency at which sites are accessed. Existing textual comparisons, such as those used to detect plagiarism, depend on the frequency of certain words but cannot account for word order or syntax. Comparisons of genetic code used to classify organisms and identify targets for new medications rely on the alignment of a tiny, subjectively chosen fraction of the DNA (1% or less).

The Berkeley Lab technology has been tested in several settings. It categorized works of literature by genre, using the books' content, more accurately than the traditional method based on word frequency. In another demonstration, the whole genomes of mammals were compared to produce a classification tree that matched the established phylogeny based on morphology. The technology was also used to produce phylogenetic trees of bacteria and viruses, which led to the classification of previously unclassified genomes.
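The comparison described above can be illustrated with a minimal sketch. It is not the Berkeley Lab implementation: the abstract does not state how features are defined, which distance measure is used, or which matrix is decomposed. The sketch assumes that a feature is any substring of fixed length k, that profiles are compared with Euclidean distance, and that the dominant eigenpair is taken from the matrix of profile dot products. All function names (`feature_profile`, `profile_distance`, `distance_matrix`, `gram_matrix`, `dominant_eigenpair`) are hypothetical.

```python
from collections import Counter
from math import sqrt


def feature_profile(text, k=3):
    """Relative frequency of every length-k substring (assumed feature definition)."""
    counts = Counter(text[i:i + k] for i in range(len(text) - k + 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("text is shorter than the feature length k")
    return {feature: n / total for feature, n in counts.items()}


def profile_distance(p, q):
    """Euclidean distance between two profiles (assumed distance measure)."""
    features = set(p) | set(q)
    return sqrt(sum((p.get(f, 0.0) - q.get(f, 0.0)) ** 2 for f in features))


def distance_matrix(texts, k=3):
    """Pairwise distances between the feature profiles of all texts."""
    profiles = [feature_profile(t, k) for t in texts]
    return [[profile_distance(a, b) for b in profiles] for a in profiles]


def gram_matrix(profiles):
    """Matrix of dot products between profiles (assumed input to the eigen step)."""
    return [[sum(p[f] * q.get(f, 0.0) for f in p) for q in profiles]
            for p in profiles]


def dominant_eigenpair(matrix, iterations=200):
    """Largest eigenvalue and its eigenvector, found by power iteration.

    Assumes a symmetric positive semidefinite matrix, such as gram_matrix output.
    """
    n = len(matrix)
    v = [1.0 / sqrt(n)] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
        eigenvalue = norm
    return eigenvalue, v


if __name__ == "__main__":
    # Illustrative inputs only: two similar DNA-like strings and one English phrase.
    texts = ["ACGTACGTACGTACGT", "ACGTACGAACGTACGT", "the quick brown fox jumps"]
    for row in distance_matrix(texts, k=3):
        print(["%.3f" % d for d in row])
    value, vector = dominant_eigenpair(gram_matrix([feature_profile(t) for t in texts]))
    print(value, vector)
```

In this sketch the two DNA-like strings lie closer to each other than either does to the English phrase, because they share most of their length-3 features. A tree or index built from such a matrix would need further steps, such as a clustering method, that are not shown here.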
STATUS:
To learn more about licensing a technology from LBNL, see http://www.lbl.gov/Tech-Transfer/licensing/index.html.
FOR MORE INFORMATION:
REFERENCE NUMBER: IB-2597
SEE THESE OTHER BERKELEY LAB TECHNOLOGIES IN THIS FIELD:
