Java 字典搜索器

coder 2024-03-02 原文

我正在尝试实现一个程序，该程序将接受用户输入，将该字符串拆分为标记，然后在字典中搜索该字符串中的单词。我对解析字符串的目标是让每个标记都是英文单词。

例如:

Input:
       aman

Split Method:
      a man
      a m an
      a m a n
      am an
      am a n
      ama n

Desired Output:
      a man

我目前有这段代码可以完成所有工作，直到所需的输出部分:

    import java.util.Scanner;
import java.io.*;

public class Words {

    public static String[] dic = new String[80368];

    public static void split(String head, String in) {

        // head + " " + in is a segmentation 
        String segment = head + " " + in;

        // count number of dictionary words
        int count = 0;
        Scanner phraseScan = new Scanner(segment);
        while (phraseScan.hasNext()) {
            String word = phraseScan.next();
            for (int i=0; i<dic.length; i++) {
                if (word.equalsIgnoreCase(dic[i])) count++;
            }
        }

        System.out.println(segment + "\t" + count + " English words");

        // recursive calls
        for (int i=1; i<in.length(); i++) {
            split(head+" "+in.substring(0,i), in.substring(i,in.length()));
        }   
    }

    public static void main (String[] args) throws IOException {
        Scanner scan = new Scanner(System.in);
        System.out.print("Enter a string: ");
        String input = scan.next();
        System.out.println();

        Scanner filescan = new Scanner(new File("src:\\dictionary.txt"));
        int wc = 0;
        while (filescan.hasNext()) {
            dic[wc] = filescan.nextLine();
            wc++;
        }

        System.out.println(wc + " words stored");

        split("", input);

    }
}

我知道有更好的方法来存储字典(例如二叉搜索树或哈希表)，但我不知道如何实现它们。

我被困在如何实现一种方法来检查拆分字符串以查看每个段是否是字典中的单词。

任何帮助都会很棒，谢谢

最佳答案

如果您想支持 20 个或更多字符，以各种可能的方式拆分输入字符串不会在合理的时间内完成。这是一种更有效的方法，内联评论:

public static void main(String[] args) throws IOException {
    // load the dictionary into a set for fast lookups
    Set<String> dictionary = new HashSet<String>();
    Scanner filescan = new Scanner(new File("dictionary.txt"));
    while (filescan.hasNext()) {
        dictionary.add(filescan.nextLine().toLowerCase());
    }

    // scan for input
    Scanner scan = new Scanner(System.in);
    System.out.print("Enter a string: ");
    String input = scan.next().toLowerCase();
    System.out.println();

    // place to store list of results, each result is a list of strings
    List<List<String>> results = new ArrayList<>();

    long time = System.currentTimeMillis();

    // start the search, pass empty stack to represent words found so far
    search(input, dictionary, new Stack<String>(), results);

    time = System.currentTimeMillis() - time;

    // list the results found
    for (List<String> result : results) {
        for (String word : result) {
            System.out.print(word + " ");
        }
        System.out.println("(" + result.size() + " words)");
    }
    System.out.println();
    System.out.println("Took " + time + "ms");
}

public static void search(String input, Set<String> dictionary,
        Stack<String> words, List<List<String>> results) {

    for (int i = 0; i < input.length(); i++) {
        // take the first i characters of the input and see if it is a word
        String substring = input.substring(0, i + 1);

        if (dictionary.contains(substring)) {
            // the beginning of the input matches a word, store on stack
            words.push(substring);

            if (i == input.length() - 1) {
                // there's no input left, copy the words stack to results
                results.add(new ArrayList<String>(words));
            } else {
                // there's more input left, search the remaining part
                search(input.substring(i + 1), dictionary, words, results);
            }

            // pop the matched word back off so we can move onto the next i
            words.pop();
        }
    }
}

示例输出:

Enter a string: aman

a man (2 words)
am an (2 words)

Took 0ms

这是一个更长的输入:

Enter a string: thequickbrownfoxjumpedoverthelazydog

the quick brown fox jump ed over the lazy dog (10 words)
the quick brown fox jump ed overt he lazy dog (10 words)
the quick brown fox jumped over the lazy dog (9 words)
the quick brown fox jumped overt he lazy dog (9 words)

Took 1ms

关于Java 字典搜索器，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/5922956/

有关Java 字典搜索器的更多相关文章

java - 等价于 Java 中的 Ruby Hash - 2
我真的很习惯使用Ruby编写以下代码:my_hash={}my_hash['test']=1Java中对应的数据结构是什么？最佳答案 HashMapmap=newHashMap();map.put("test",1);我假设？关于java-等价于Java中的RubyHash，我们在StackOverflow上找到一个类似的问题： https://stackoverflow.com/questions/22737685/
ruby-on-rails - Nokogiri:使用 XPath 搜索 <div> - 2
我使用Nokogiri(Rubygem)css搜索寻找某些在我的html里面。看起来Nokogiri的css搜索不喜欢正则表达式。我想切换到Nokogiri的xpath搜索，因为这似乎支持搜索字符串中的正则表达式。如何在xpath搜索中实现下面提到的(伪)css搜索？require'rubygems'require'nokogiri'value=Nokogiri::HTML.parse(ABBlaCD3"HTML_END#my_blockisgivenmy_bl="1"#my_eqcorrespondstothisregexmy_eq="\/[0-9]+\/"#FIXMEThefoll
java - 从 JRuby 调用 Java 类的问题 - 2
我正在尝试使用boilerpipe来自JRuby。我看过guide从JRuby调用Java，并成功地将它与另一个Java包一起使用，但无法弄清楚为什么同样的东西不能用于boilerpipe。我正在尝试基本上从JRuby中执行与此Java等效的操作:URLurl=newURL("http://www.example.com/some-location/index.html");Stringtext=ArticleExtractor.INSTANCE.getText(url);在JRuby中试过这个:require'java'url=java.net.URL.new("http://www
java - 我的模型类或其他类中应该有逻辑吗 - 2
我只想对我一直在思考的这个问题有其他意见，例如我有classuser_controller和classuserclassUserattr_accessor:name,:usernameendclassUserController//dosomethingaboutanythingaboutusersend问题是我的User类中是否应该有逻辑user=User.newuser.do_something(user1)oritshouldbeuser_controller=UserController.newuser_controller.do_something(user1,user2)我
java - 什么相当于 ruby 的 rack 或 python 的 Java wsgi？ - 2
什么是ruby的rack或python的Java的wsgi？还有一个路由库。最佳答案来自Python标准PEP333:Bycontrast,althoughJavahasjustasmanywebapplicationframeworksavailable,Java's"servlet"APImakesitpossibleforapplicationswrittenwithanyJavawebapplicationframeworktoruninanywebserverthatsupportstheservletAPI.ht
Observability：从零开始创建 Java 微服务并监控它（二） - 2
这篇文章是继上一篇文章“Observability：从零开始创建Java微服务并监控它（一）”的续篇。在上一篇文章中，我们讲述了如何创建一个Javaweb应用，并使用Filebeat来收集应用所生成的日志。在今天的文章中，我来详述如何收集应用的指标，使用APM来监控应用并监督web服务的在线情况。源码可以在地址 https://github.com/liu-xiao-guo/java_observability 进行下载。摄入指标指标被视为可以随时更改的时间点值。当前请求的数量可以改变任何毫秒。你可能有1000个请求的峰值，然后一切都回到一个请求。这也意味着这些指标可能不准确，你还想提取最小/
【Java 面试合集】HashMap中为什么引入红黑树，而不是AVL树呢 - 2
HashMap中为什么引入红黑树，而不是AVL树呢1.概述开始学习这个知识点之前我们需要知道，在JDK1.8以及之前，针对HashMap有什么不同。JDK1.7的时候，HashMap的底层实现是数组+链表JDK1.8的时候，HashMap的底层实现是数组+链表+红黑树我们要思考一个问题，为什么要从链表转为红黑树呢。首先先让我们了解下链表有什么不好？？？2.链表上述的截图其实就是链表的结构，我们来看下链表的增删改查的时间复杂度增：因为链表不是线性结构，所以每次添加的时候，只需要移动一个节点，所以可以理解为复杂度是N(1)删：算法时间复杂度跟增保持一致查：既然是非线性结构，所以查询某一个节点的时候
微信小程序通过字典表匹配对应数据 - 2
前言一般来说，前端根据后台返回code码展示对应内容只需要在前台判断code值展示对应的内容即可，但要是匹配的code码比较多或者多个页面用到时，为了便于后期维护，后台就会使用字典表让前端匹配，下面我将在微信小程序中通过wxs的方法实现这个操作。为什么要使用wxs？{{method(a,b)}}可以看到，上述代码是一个调用方法传值的操作，在vue中很常见，多用于数据之间的转换，但由于微信小程序诸多限制的原因，你并不能优雅的这样操作，可能有人会说，为什么不用if判断实现呢？但是if判断的局限性在于如果存在数据量过大时，大量重复性操作和if判断会让你的代码显得异常冗余。wxswxs相当于是一个独立
【Java入门】使用Java实现文件夹的遍历 - 2
遍历文件夹我们通常是使用递归进行操作，这种方式比较简单，也比较容易理解。本文为大家介绍另一种不使用递归的方式，由于没有使用递归，只用到了循环和集合，所以效率更高一些！一、使用递归遍历文件夹整体思路1、使用File封装初始目录，2、打印这个目录3、获取这个目录下所有的子文件和子目录的数组。4、遍历这个数组，取出每个File对象4-1、如果File是否是一个文件，打印4-2、否则就是一个目录，递归调用代码实现publicclassSearchFile{publicstaticvoidmain(String[]args){//初始目录Filedir=newFile("d:/Dev");Datebeg
java - 为什么 ruby modulo 与 java/other lang 不同？ - 2
我基本上来自Java背景并且努力理解Ruby中的模运算。(5%3)(-5%3)(5%-3)(-5%-3)Java中的上述操作产生，2个-22个-2但在Ruby中，相同的表达式会产生21个-1-2.Ruby在逻辑上有多擅长这个？模块操作在Ruby中是如何实现的？如果将同一个操作定义为一个web服务，两个服务如何匹配逻辑。最佳答案在Java中，模运算的结果与被除数的符号相同。在Ruby中，它与除数的符号相同。remainder()在Ruby中与被除数的符号相同。您可能还想引用modulooperation.

Java 字典搜索器

有关Java 字典搜索器的更多相关文章

随机推荐