我正在编写一些代码,从 Python 获取二进制数据,将其通过管道传输到 C++,对数据进行一些处理(在本例中计算互信息度量),然后将结果通过管道传输回 Python。在测试时,我发现如果我发送的数据是一组 2 个尺寸小于 1500 X 1500 的数组,一切正常,但如果我发送 2 个 2K X 2K 的数组,我会得到很多损坏的废话。
我目前认为代码的算法部分很好,因为它在使用小型 (<=1500 x1500)="" 数组进行测试期间提供了预期的答案。这使我相信这是="" stdin="" 或="" stdout="">=1500>
Python代码和C++代码如下。
Python 代码:
import subprocess
import struct
import sys
import numpy as np
#set up the variables needed
bytesPerDouble = 8
sizeX = 2000
sizeY = 2000
offset = sizeX*sizeY
totalBytesPerArray = sizeX*sizeY*bytesPerDouble
totalBytes = totalBytesPerArray*2 #the 2 is because we pass 2 different versions of the 2D array
#setup the testing data array
a = np.zeros(sizeX*sizeY*2, dtype='d')
for i in range(sizeX):
for j in range(sizeY):
a[j+i*sizeY] = i
a[j+i*sizeY+offset] = i
if i % 10 == 0:
a[j+i*sizeY+offset] = j
data = a.tobytes('C')
strTotalBytes = str(totalBytes)
strLineBytes = str(sizeY*bytesPerDouble)
#communicate with c++ code
print("starting C++ code")
command = "C:\Python27\PythonPipes.exe"
proc = subprocess.Popen([command, strTotalBytes, strLineBytes, str(sizeY), str(sizeX)], stdin=subprocess.PIPE,stderr=subprocess.PIPE,stdout=subprocess.PIPE)
ByteBuffer = (data)
proc.stdin.write(ByteBuffer)
print("Reading results back from C++")
for i in range(sizeX):
returnvalues = proc.stdout.read(sizeY*bytesPerDouble)
a = buffer(returnvalues)
b = struct.unpack_from(str(sizeY)+'d', a)
print str(b) + " " + str(i)
print('done')
C++代码: 主要功能:
int main(int argc, char **argv) {
int count = 0;
long totalbytes = stoi(argv[argc-4], nullptr,10); //bytes being transfered
long bytechunk = stoi(argv[argc - 3], nullptr, 10); //bytes being transfered at a time
long height = stoi(argv[argc-2], nullptr, 10); //bytes being transfered at a time
long width = stoi(argv[argc-1], nullptr, 10); //bytes being transfered at a time
long offset = totalbytes / sizeof(double) / 2;
data = new double[totalbytes/sizeof(double)];
int columnindex = 0;
//read in data from pipe
while (count<totalbytes) {
fread(&(data[columnindex]), 1, bytechunk, stdin);
columnindex += bytechunk / sizeof(double);
count += bytechunk;
}
//calculate the data transform
MutualInformation MI = MutualInformation();
MI.Initialize(data, height, width, offset);
MI.calcMI();
count = 0;
//*
//write out data to pipe
columnindex = 0;
while (count<totalbytes/2) {
fwrite(&(MI.getOutput()[columnindex]), 1, bytechunk, stdout);
fflush(stdout);
count += bytechunk;
columnindex += bytechunk/sizeof(double);
}
//*/
delete [] data;
return 0;
}
如果您需要它,实际的处理代码:
double MutualInformation::calcMI(){
double rvalue = 0.0;
std::map<int, map<int, double>> lHistXY = map<int, map<int, double>>();
std::map<int, double> lHistX = map<int, double>();
std::map<int, double> lHistY = map<int, double>();
typedef std::map<int, std::map<int, double>>::iterator HistXY_iter;
typedef std::map<int, double>::iterator HistY_iter;
//calculate Entropys and MI
double MI = 0.0;
double Hx = 0.0;
double Hy = 0.0;
double Px = 0.0;
double Py = 0.0;
double Pxy = 0.0;
//scan through the image
int ip = 0;
int jp = 0;
int chipsize = 3;
//setup zero array
double * zeros = new double[this->mHeight];
for (int j = 0; j < this->mHeight; j++){
zeros[j] = 0.0;
}
//zero out Output array
for (int i = 0; i < this->mWidth; i++){
memcpy(&(this->mOutput[i*this->mHeight]), zeros, this->mHeight*8);
}
double index = 0.0;
for (int ioutter = chipsize; ioutter < (this->mWidth - chipsize); ioutter++){
//write out processing status
//index = (double)ioutter;
//fwrite(&index, 8, 1, stdout);
//fflush(stdout);
//*
for (int j = chipsize; j < (this->mHeight - chipsize); j++){
//clear the histograms
lHistX.clear();
lHistY.clear();
lHistXY.clear();
//chip out a section of the image
for (int k = -chipsize; k <= chipsize; k++){
for (int l = -chipsize; l <= chipsize; l++){
ip = ioutter + k;
jp = j + l;
//update X histogram
if (lHistX.count(int(this->mData[ip*this->mHeight + jp]))){
lHistX[int(this->mData[ip*this->mHeight + jp])] += 1.0;
}else{
lHistX[int(this->mData[ip*this->mHeight + jp])] = 1.0;
}
//update Y histogram
if (lHistY.count(int(this->mData[ip*this->mHeight + jp+this->mOffset]))){
lHistY[int(this->mData[ip*this->mHeight + jp+this->mOffset])] += 1.0;
}
else{
lHistY[int(this->mData[ip*this->mHeight + jp+this->mOffset])] = 1.0;
}
//update X and Y Histogram
if (lHistXY.count(int(this->mData[ip*this->mHeight + jp]))){
//X Key exists check if Y key exists
if (lHistXY[int(this->mData[ip*this->mHeight + jp])].count(int(this->mData[ip*this->mHeight + jp + this->mOffset]))){
//X & Y keys exist
lHistXY[int(this->mData[ip*this->mHeight + jp])][int(this->mData[ip*this->mHeight + jp + this->mOffset])] += 1;
}else{
//X exist but Y doesn't
lHistXY[int(this->mData[ip*this->mHeight + jp])][int(this->mData[ip*this->mHeight + jp + this->mOffset])] = 1;
}
}else{
//X Key Didn't exist
lHistXY[int(this->mData[ip*this->mHeight + jp])][int(this->mData[ip*this->mHeight + jp + this->mOffset])] = 1;
};
}
}
//calculate PMI, Hx, Hy
// iterator->first = key
// iterator->second = value
MI = 0.0;
Hx = 0.0;
Hy = 0.0;
for (HistXY_iter Hist2D_iter = lHistXY.begin(); Hist2D_iter != lHistXY.end(); Hist2D_iter++) {
Px = lHistX[Hist2D_iter->first] / ((double) this->mOffset);
Hx -= Px*log(Px);
for (HistY_iter HistY_iter = Hist2D_iter->second.begin(); HistY_iter != Hist2D_iter->second.end(); HistY_iter++) {
Py = lHistY[HistY_iter->first] / ((double) this->mOffset);
Hy -= Py*log(Py);
Pxy = HistY_iter->second / ((double) this->mOffset);
MI += Pxy*log(Pxy / Py / Px);
}
}
//normalize PMI to max(Hx,Hy) so that the PMI value runs from 0 to 1
if (Hx >= Hy && Hx > 0.0){
MI /= Hx;
}else if(Hy > Hx && Hy > 0.0){
MI /= Hy;
}
else{
MI = 0.0;
}
//write PMI to data output array
if (MI < 1.1){
this->mOutput[ioutter*this->mHeight + j] = MI;
}
else{
this->mOutput[ioutter*this->mHeight + j] = 0.0;
}
}
}
return rvalue;
}
对于返回有意义的东西的数组,我得到的输出范围在 0 和 1 之间,如下所示:
(0.0, 0.0, 0.0, 0.7160627908692593, 0.6376472316395495, 0.5728801401524277,...
对于 2Kx2K 或更大的数组,我得到这样的无意义(即使代码将值限制在 0 和 1 之间):
(-2.2491400820412374e+228, -2.2491400820412374e+228, -2.2491400820412374e+228, -2.2491400820412374e+228, -2.2491400820412378,4e+22
我想知道为什么这段代码在分配到 0.0 和 1 之间后会破坏数据集,这是否是管道问题、标准输入/标准输出问题、某种缓冲区问题或我根本没有看到编码问题。
更新 我尝试使用 Chris 建议的代码以更小的 block 传递数据,但没有成功。还要注意的是,我在 stdout 上添加了一个用于 ferror 的 catch,它从未被触发,所以我很确定这些字节至少可以到达 stdout。是否有其他东西以某种方式写入标准输出?在我的程序运行时,可能会有一个额外的字节进入标准输出?我觉得这值得怀疑,因为在第 10 个条目中的第 4 个 fwrite 读取中始终出现错误。
根据 Craig 的要求,这里是完整的 C++ 代码(完整的 Python 代码已经发布):它位于 3 个文件中:
主要.cpp
#include <stdio.h>
#include <stdlib.h>
#include <string>
#include <iostream>
#include "./MutualInformation.h"
double * data;
using namespace std;
void
xxwrite(unsigned char *buf, size_t wlen, FILE *fo)
{
size_t xlen;
for (; wlen > 0; wlen -= xlen, buf += xlen) {
xlen = wlen;
if (xlen > 1024)
xlen = 1024;
xlen = fwrite(buf, 1, xlen, fo);
fflush(fo);
}
}
int main(int argc, char **argv) {
int count = 0;
long totalbytes = stoi(argv[argc-4], nullptr,10); //bytes being transfered
long bytechunk = stoi(argv[argc - 3], nullptr, 10); //bytes being transfered at a time
long height = stoi(argv[argc-2], nullptr, 10); //bytes being transfered at a time
long width = stoi(argv[argc-1], nullptr, 10); //bytes being transfered at a time
long offset = totalbytes / sizeof(double) / 2;
data = new double[totalbytes/sizeof(double)];
int columnindex = 0;
//read in data from pipe
while (count<totalbytes) {
fread(&(data[columnindex]), 1, bytechunk, stdin);
columnindex += bytechunk / sizeof(double);
count += bytechunk;
}
//calculate the data transform
MutualInformation MI = MutualInformation();
MI.Initialize(data, height, width, offset);
MI.calcMI();
count = 0;
columnindex = 0;
while (count<totalbytes/2) {
xxwrite((unsigned char*)&(MI.getOutput()[columnindex]), bytechunk, stdout);
count += bytechunk;
columnindex += bytechunk/sizeof(double);
}
delete [] data;
return 0;
}
互信息.h
#include <map>
using namespace std;
class MutualInformation
{
private:
double * mData;
double * mOutput;
long mHeight;
long mWidth;
long mOffset;
public:
MutualInformation();
~MutualInformation();
bool Initialize(double * data, long Height, long Width, long Offset);
const double * getOutput();
double calcMI();
};
互信息.cpp
#include "MutualInformation.h"
MutualInformation::MutualInformation()
{
this->mData = nullptr;
this->mOutput = nullptr;
this->mHeight = 0;
this->mWidth = 0;
}
MutualInformation::~MutualInformation()
{
delete[] this->mOutput;
}
bool MutualInformation::Initialize(double * data, long Height, long Width, long Offset){
bool rvalue = false;
this->mData = data;
this->mHeight = Height;
this->mWidth = Width;
this->mOffset = Offset;
//allocate output data
this->mOutput = new double[this->mHeight*this->mWidth];
return rvalue;
}
const double * MutualInformation::getOutput(){
return this->mOutput;
}
double MutualInformation::calcMI(){
double rvalue = 0.0;
std::map<int, map<int, double>> lHistXY = map<int, map<int, double>>();
std::map<int, double> lHistX = map<int, double>();
std::map<int, double> lHistY = map<int, double>();
typedef std::map<int, std::map<int, double>>::iterator HistXY_iter;
typedef std::map<int, double>::iterator HistY_iter;
//calculate Entropys and MI
double MI = 0.0;
double Hx = 0.0;
double Hy = 0.0;
double Px = 0.0;
double Py = 0.0;
double Pxy = 0.0;
//scan through the image
int ip = 0;
int jp = 0;
int chipsize = 3;
//setup zero array
double * zeros = new double[this->mHeight];
for (int j = 0; j < this->mHeight; j++){
zeros[j] = 0.0;
}
//zero out Output array
for (int i = 0; i < this->mWidth; i++){
memcpy(&(this->mOutput[i*this->mHeight]), zeros, this->mHeight*8);
}
double index = 0.0;
for (int ioutter = chipsize; ioutter < (this->mWidth - chipsize); ioutter++){
for (int j = chipsize; j < (this->mHeight - chipsize); j++){
//clear the histograms
lHistX.clear();
lHistY.clear();
lHistXY.clear();
//chip out a section of the image
for (int k = -chipsize; k <= chipsize; k++){
for (int l = -chipsize; l <= chipsize; l++){
ip = ioutter + k;
jp = j + l;
//update X histogram
if (lHistX.count(int(this->mData[ip*this->mHeight + jp]))){
lHistX[int(this->mData[ip*this->mHeight + jp])] += 1.0;
}else{
lHistX[int(this->mData[ip*this->mHeight + jp])] = 1.0;
}
//update Y histogram
if (lHistY.count(int(this->mData[ip*this->mHeight + jp+this->mOffset]))){
lHistY[int(this->mData[ip*this->mHeight + jp+this->mOffset])] += 1.0;
}
else{
lHistY[int(this->mData[ip*this->mHeight + jp+this->mOffset])] = 1.0;
}
//update X and Y Histogram
if (lHistXY.count(int(this->mData[ip*this->mHeight + jp]))){
//X Key exists check if Y key exists
if (lHistXY[int(this->mData[ip*this->mHeight + jp])].count(int(this->mData[ip*this->mHeight + jp + this->mOffset]))){
//X & Y keys exist
lHistXY[int(this->mData[ip*this->mHeight + jp])][int(this->mData[ip*this->mHeight + jp + this->mOffset])] += 1;
}else{
//X exist but Y doesn't
lHistXY[int(this->mData[ip*this->mHeight + jp])][int(this->mData[ip*this->mHeight + jp + this->mOffset])] = 1;
}
}else{
//X Key Didn't exist
lHistXY[int(this->mData[ip*this->mHeight + jp])][int(this->mData[ip*this->mHeight + jp + this->mOffset])] = 1;
};
}
}
//calculate PMI, Hx, Hy
// iterator->first = key
// iterator->second = value
MI = 0.0;
Hx = 0.0;
Hy = 0.0;
for (HistXY_iter Hist2D_iter = lHistXY.begin(); Hist2D_iter != lHistXY.end(); Hist2D_iter++) {
Px = lHistX[Hist2D_iter->first] / ((double) this->mOffset);
Hx -= Px*log(Px);
for (HistY_iter HistY_iter = Hist2D_iter->second.begin(); HistY_iter != Hist2D_iter->second.end(); HistY_iter++) {
Py = lHistY[HistY_iter->first] / ((double) this->mOffset);
Hy -= Py*log(Py);
Pxy = HistY_iter->second / ((double) this->mOffset);
MI += Pxy*log(Pxy / Py / Px);
}
}
//normalize PMI to max(Hx,Hy) so that the PMI value runs from 0 to 1
if (Hx >= Hy && Hx > 0.0){
MI /= Hx;
}else if(Hy > Hx && Hy > 0.0){
MI /= Hy;
}
else{
MI = 0.0;
}
//write PMI to data output array
if (MI < 1.1){
this->mOutput[ioutter*this->mHeight + j] = MI;
}
else{
this->mOutput[ioutter*this->mHeight + j] = 0.0;
//cout << "problem with output";
}
}
}
//*/
return rvalue;
}
6502 解决
6502 下面的回答解决了我的问题。我需要明确告诉 Windows 对标准输入/标准输出使用二进制模式。为此,我必须在我的主 cpp 文件中包含 2 个新的头文件。
#include <fcntl.h>
#include <io.h>
将以下代码行(由于 Visual Studio 提示而从 6502 的 POSIX 版本中修改)添加到我的主要功能的开头
_setmode(_fileno(stdout), O_BINARY);
_setmode(_fileno(stdin), O_BINARY);
然后将这些行添加到我的 Python 代码中:
import os, msvcrt
msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
msvcrt.setmode(sys.stdin.fileno(), os.O_BINARY)
最佳答案
问题是 Windows 中的 stdin/stdout 是以文本模式打开的,而不是以二进制模式打开的,因此当字符 13 (\r) 被发送。
例如,您可以在 Python 中设置二进制模式
import os, msvcrt
msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
msvcrt.setmode(sys.stdin.fileno(), os.O_BINARY)
在 C++ 中用
_setmode(fileno(stdout), O_BINARY);
_setmode(fileno(stdin), O_BINARY);
关于python - 数据损坏 C++ 和 Python 之间的管道,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/36987437/
关闭。这个问题是opinion-based.它目前不接受答案。想要改进这个问题?更新问题,以便editingthispost可以用事实和引用来回答它.关闭4年前。Improvethisquestion我想在固定时间创建一系列低音和高音调的哔哔声。例如:在150毫秒时发出高音调的蜂鸣声在151毫秒时发出低音调的蜂鸣声200毫秒时发出低音调的蜂鸣声250毫秒的高音调蜂鸣声有没有办法在Ruby或Python中做到这一点?我真的不在乎输出编码是什么(.wav、.mp3、.ogg等等),但我确实想创建一个输出文件。
我主要使用Ruby来执行此操作,但到目前为止我的攻击计划如下:使用gemsrdf、rdf-rdfa和rdf-microdata或mida来解析给定任何URI的数据。我认为最好映射到像schema.org这样的统一模式,例如使用这个yaml文件,它试图描述数据词汇表和opengraph到schema.org之间的转换:#SchemaXtoschema.orgconversion#data-vocabularyDV:name:namestreet-address:streetAddressregion:addressRegionlocality:addressLocalityphoto:i
我构建了两个需要相互通信和发送文件的Rails应用程序。例如,一个Rails应用程序会发送请求以查看其他应用程序数据库中的表。然后另一个应用程序将呈现该表的json并将其发回。我还希望一个应用程序将存储在其公共(public)目录中的文本文件发送到另一个应用程序的公共(public)目录。我从来没有做过这样的事情,所以我什至不知道从哪里开始。任何帮助,将不胜感激。谢谢! 最佳答案 无论Rails是什么,几乎所有Web应用程序都有您的要求,大多数现代Web应用程序都需要相互通信。但是有一个小小的理解需要你坚持下去,网站不应直接访问彼此
我的瘦服务器配置了nginx,我的ROR应用程序正在它们上运行。在我发布代码更新时运行thinrestart会给我的应用程序带来一些停机时间。我试图弄清楚如何优雅地重启正在运行的Thin实例,但找不到好的解决方案。有没有人能做到这一点? 最佳答案 #Restartjustthethinserverdescribedbythatconfigsudothin-C/etc/thin/mysite.ymlrestartNginx将继续运行并代理请求。如果您将Nginx设置为使用多个上游服务器,例如server{listen80;server
在Cooper的书BeginningRuby中,第166页有一个我无法重现的示例。classSongincludeComparableattr_accessor:lengthdef(other)@lengthother.lengthenddefinitialize(song_name,length)@song_name=song_name@length=lengthendenda=Song.new('Rockaroundtheclock',143)b=Song.new('BohemianRhapsody',544)c=Song.new('MinuteWaltz',60)a.betwee
有时我需要处理键/值数据。我不喜欢使用数组,因为它们在大小上没有限制(很容易不小心添加超过2个项目,而且您最终需要稍后验证大小)。此外,0和1的索引变成了魔数(MagicNumber),并且在传达含义方面做得很差(“当我说0时,我的意思是head...”)。散列也不合适,因为可能会不小心添加额外的条目。我写了下面的类来解决这个问题:classPairattr_accessor:head,:taildefinitialize(h,t)@head,@tail=h,tendend它工作得很好并且解决了问题,但我很想知道:Ruby标准库是否已经带有这样一个类? 最佳
我正在检查一个Rails项目。在ERubyHTML模板页面上,我看到了这样几行:我不明白为什么不这样写:在这种情况下,||=和ifnil?有什么区别? 最佳答案 在这种特殊情况下没有区别,但可能是出于习惯。每当我看到nil?被使用时,它几乎总是使用不当。在Ruby中,很少有东西在逻辑上是假的,只有文字false和nil是。这意味着像if(!x.nil?)这样的代码几乎总是更好地表示为if(x)除非期望x可能是文字false。我会将其切换为||=false,因为它具有相同的结果,但这在很大程度上取决于偏好。唯一的缺点是赋值会在每次运行
这个问题在这里已经有了答案:关闭10年前。PossibleDuplicate:Pythonconditionalassignmentoperator对于这样一个简单的问题表示歉意,但是谷歌搜索||=并不是很有帮助;)Python中是否有与Ruby和Perl中的||=语句等效的语句?例如:foo="hey"foo||="what"#assignfooifit'sundefined#fooisstill"hey"bar||="yeah"#baris"yeah"另外,类似这样的东西的通用术语是什么?条件分配是我的第一个猜测,但Wikipediapage跟我想的不太一样。
什么是ruby的rack或python的Java的wsgi?还有一个路由库。 最佳答案 来自Python标准PEP333:Bycontrast,althoughJavahasjustasmanywebapplicationframeworksavailable,Java's"servlet"APImakesitpossibleforapplicationswrittenwithanyJavawebapplicationframeworktoruninanywebserverthatsupportstheservletAPI.ht
我正在尝试使用Curbgem执行以下POST以解析云curl-XPOST\-H"X-Parse-Application-Id:PARSE_APP_ID"\-H"X-Parse-REST-API-Key:PARSE_API_KEY"\-H"Content-Type:image/jpeg"\--data-binary'@myPicture.jpg'\https://api.parse.com/1/files/pic.jpg用这个:curl=Curl::Easy.new("https://api.parse.com/1/files/lion.jpg")curl.multipart_form_