python - 找到包含 0 个样本 (shape=(0, 40)) 的数组，而至少需要 1 个

coder 2023-08-21 原文

我正在使用 Python 2.7、sklearn 0.17.1、numpy 1.11.0 测试一个简单的预测程序。我从 LDA 模型中获得了概率矩阵，现在我想创建 RandomForestClassifier 以通过概率预测结果。我的代码是:

maxlen = 40
props = []
for doc in corpus:
    topics = model.get_document_topics(doc) 
    tprops = [0] * maxlen
    for topic in topics:
        tprops[topics[0]] = topics[1]
    props.append(tprops)

ntheta = np.array(props)
ny = np.array(y)

clf = RandomForestClassifier(n_estimators=100)
accuracy = cross_val_score(clf, ntheta, ny, scoring = 'accuracy')
print accuracy

ValueError                                Traceback (most recent call last)
<ipython-input-65-a7d276df43e9> in <module>()
      1 # clf.fit(nteta, ny)
      2 print nteta.shape, ny.shape
----> 3 accuracy = cross_val_score(clf, nteta, ny, scoring = 'accuracy')
      4 print accuracy

/home/egor/anaconda2/lib/python2.7/site-packages/sklearn/cross_validation.pyc in cross_val_score(estimator, X, y, scoring, cv, n_jobs, verbose, fit_params, pre_dispatch)
   1431                                               train, test, verbose, None,
   1432                                               fit_params)
-> 1433                       for train, test in cv)
   1434     return np.array(scores)[:, 0]
   1435 

/home/egor/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in __call__(self, iterable)
    798             # was dispatched. In particular this covers the edge
    799             # case of Parallel used with an exhausted iterator.
--> 800             while self.dispatch_one_batch(iterator):
    801                 self._iterating = True
    802             else:

/home/egor/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in dispatch_one_batch(self, iterator)
    656                 return False
    657             else:
--> 658                 self._dispatch(tasks)
    659                 return True
    660 

/home/egor/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in _dispatch(self, batch)
    564 
    565         if self._pool is None:
--> 566             job = ImmediateComputeBatch(batch)
    567             self._jobs.append(job)
    568             self.n_dispatched_batches += 1

/home/egor/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in __init__(self, batch)
    178         # Don't delay the application, to avoid keeping the input
    179         # arguments in memory
--> 180         self.results = batch()
    181 
    182     def get(self):

/home/egor/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in __call__(self)
     70 
     71     def __call__(self):
---> 72         return [func(*args, **kwargs) for func, args, kwargs in self.items]
     73 
     74     def __len__(self):

/home/egor/anaconda2/lib/python2.7/site-packages/sklearn/cross_validation.pyc in _fit_and_score(estimator, X, y, scorer, train, test, verbose, parameters, fit_params, return_train_score, return_parameters, error_score)
   1529             estimator.fit(X_train, **fit_params)
   1530         else:
-> 1531             estimator.fit(X_train, y_train, **fit_params)
   1532 
   1533     except Exception as e:

/home/egor/anaconda2/lib/python2.7/site-packages/sklearn/ensemble/forest.pyc in fit(self, X, y, sample_weight)
    210         """
    211         # Validate or convert input data
--> 212         X = check_array(X, dtype=DTYPE, accept_sparse="csc")
    213         if issparse(X):
    214             # Pre-sort indices to avoid that each individual tree of the

/home/egor/anaconda2/lib/python2.7/site-packages/sklearn/utils/validation.pyc in check_array(array, accept_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    405                              " minimum of %d is required%s."
    406                              % (n_samples, shape_repr, ensure_min_samples,
--> 407                                 context))
    408 
    409     if ensure_min_features > 0 and array.ndim == 2:

ValueError: Found array with 0 sample(s) (shape=(0, 40)) while a minimum of 1 is required.

更新为什么我得到 2 减？让批评者具有建设性。

更新

cotique发现y填错了(肯定是其他类)。如果 y 填写正确，那么问题就不会发生。在我的例子中，类是错误的，它们的数量是 39774。但理论上这不是答案，为什么当我们有 39774 个类并且必须预测它们时会发生错误。

最佳答案

这是来自 scikit-learn 存储库 (validation.py#L409) 的原始代码:

if ensure_min_samples > 0:
   n_samples = _num_samples(array)
   if n_samples < ensure_min_samples:
      raise ValueError("Found array with %d sample(s) (shape=%s) while a"
                       " minimum of %d is required%s."
                        % (n_samples, shape_repr, ensure_min_samples,
                        context))

因此，n_samples = _num_samples(array)。顺便说一下，数组 是要检查/转换的输入对象。

接下来，validation.py#L111 :

def _num_samples(x):
    """Return number of samples in array-like x."""
    if hasattr(x, 'fit'):
        # stuff
    if not hasattr(x, '__len__') and not hasattr(x, 'shape'):
        # stuff
    if hasattr(x, 'shape'):
        if len(x.shape) == 0:
            # raise TypeError
        return x.shape[0]
    else:
        return len(x)

因此，样本数等于 array 第一维的长度，即 0 因为 array.shape = (0, 40).

我不知道这一切意味着什么，但我希望它能让事情变得更清楚。

关于python - 找到包含 0 个样本 (shape=(0, 40)) 的数组，而至少需要 1 个，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/37632550/

有关python - 找到包含 0 个样本 (shape=(0, 40)) 的数组，而至少需要 1 个的更多相关文章

python - 如何使用 Ruby 或 Python 创建一系列高音调和低音调的蜂鸣声？ - 2
关闭。这个问题是opinion-based.它目前不接受答案。想要改进这个问题？更新问题，以便editingthispost可以用事实和引用来回答它.关闭4年前。Improvethisquestion我想在固定时间创建一系列低音和高音调的哔哔声。例如:在150毫秒时发出高音调的蜂鸣声在151毫秒时发出低音调的蜂鸣声200毫秒时发出低音调的蜂鸣声250毫秒的高音调蜂鸣声有没有办法在Ruby或Python中做到这一点？我真的不在乎输出编码是什么(.wav、.mp3、.ogg等等)，但我确实想创建一个输出文件。
ruby - 我需要将 Bundler 本身添加到 Gemfile 中吗？ - 2
当我使用Bundler时，是否需要在我的Gemfile中将其列为依赖项？毕竟，我的代码中有些地方需要它。例如，当我进行Bundler设置时:require"bundler/setup" 最佳答案没有。您可以尝试，但首先您必须用鞋带将自己抬离地面。关于ruby-我需要将Bundler本身添加到Gemfile中吗？，我们在StackOverflow上找到一个类似的问题： https://stackoverflow.com/questions/4758609/
ruby-on-rails - 在 Ruby 中循环遍历多个数组 - 2
我有多个ActiveRecord子类Item的实例数组，我需要根据最早的事件循环打印。在这种情况下，我需要打印付款和维护日期，如下所示:ItemAmaintenancerequiredin5daysItemBpaymentrequiredin6daysItemApaymentrequiredin7daysItemBmaintenancerequiredin8days我目前有两个查询，用于查找maintenance和payment项目(非排他性查询)，并输出如下内容:paymentrequiredin...maintenancerequiredin...有什么方法可以改善上述(丑陋的)代
ruby - 多次弹出/移动 ruby 数组 - 2
我的代码目前看起来像这样numbers=[1,2,3,4,5]defpop_threepop=[]3.times{pop有没有办法在一行中完成pop_three方法中的内容？我基本上想做类似numbers.slice(0,3)的事情，但要删除切片中的数组项。嗯...嗯，我想我刚刚意识到我可以试试slice! 最佳答案是numbers.pop(3)或者numbers.shift(3)如果你想要另一边。关于ruby-多次弹出/移动ruby数组，我们在StackOverflow上找到一
ruby - 将数组的内容转换为 int - 2
我需要读入一个包含数字列表的文件。此代码读取文件并将其放入二维数组中。现在我需要获取数组中所有数字的平均值，但我需要将数组的内容更改为int。有什么想法可以将to_i方法放在哪里吗？ClassTerraindefinitializefile_name@input=IO.readlines(file_name)#readinfile@size=@input[0].to_i@land=[@size]x=1whilex 最佳答案只需将数组映射为整数:@land边注如果你想得到一条线的平均值，你可以这样做:values=@input[x]
ruby - 检查 "command"的输出应该包含 NilClass 的意外崩溃 - 2
为了将Cucumber用于命令行脚本，我按照提供的说明安装了arubagem。它在我的Gemfile中，我可以验证是否安装了正确的版本并且我已经包含了require'aruba/cucumber'在'features/env.rb'中为了确保它能正常工作，我写了以下场景:@announceScenario:Testingcucumber/arubaGivenablankslateThentheoutputfrom"ls-la"shouldcontain"drw"假设事情应该失败。它确实失败了，但失败的原因是错误的:@announceScenario:Testingcucumber/ar
ruby - 通过 erb 模板输出 ruby 数组 - 2
我正在使用puppet为ruby程序提供一组常量。我需要提供一组主机名，我的程序将对其进行迭代。在我之前使用的bash脚本中，我只是将它作为一个puppet变量hosts=>"host1,host2"我将其提供给bash脚本作为HOSTS=显然这对ruby不太适用——我需要它的格式hosts=["host1","host2"]自从phosts和putsmy_array.inspect提供输出["host1","host2"]我希望使用其中之一。不幸的是，我终其一生都无法弄清楚如何让它发挥作用。我尝试了以下各项:我发现某处他们指出我需要在函数调用前放置“function_”……这
ruby - 检查数组是否在增加 - 2
这个问题在这里已经有了答案:Checktoseeifanarrayisalreadysorted?(8个答案)关闭9年前。我只是想知道是否有办法检查数组是否在增加？这是我的解决方案，但我正在寻找更漂亮的方法:n=-1@arr.flatten.each{|e|returnfalseife
ruby - rspec 需要 .rspec 文件中的 spec_helper - 2
我注意到像bundler这样的项目在每个specfile中执行requirespec_helper我还注意到rspec使用选项--require，它允许您在引导rspec时要求一个文件。您还可以将其添加到.rspec文件中，因此只要您运行不带参数的rspec就会添加它。使用上述方法有什么缺点可以解释为什么像bundler这样的项目选择在每个规范文件中都需要spec_helper吗？最佳答案我不在Bundler上工作，所以我不能直接谈论他们的做法。并非所有项目都checkin.rspec文件。原因是这个文件，通常按照当前的惯例，只
ruby - 如何在 Lion 上安装 Xcode 4.6，需要用 RVM 升级 ruby - 2
我实际上是在尝试使用RVM在我的OSX10.7.5上更新ruby，并在输入以下命令后:rvminstallruby我得到了以下回复:Searchingforbinaryrubies,thismighttakesometime.Checkingrequirementsforosx.Installingrequirementsforosx.Updatingsystem.......Errorrunning'requirements_osx_brew_update_systemruby-2.0.0-p247',pleaseread/Users/username/.rvm/log/138121

python - 找到包含 0 个样本 (shape=(0, 40)) 的数组，而至少需要 1 个

有关python - 找到包含 0 个样本 (shape=(0, 40)) 的数组，而至少需要 1 个的更多相关文章

随机推荐