jjzjj

c# - 用于快速插入/删除的 .net 集合

coder 2024-05-26 原文

我需要维护一个连接客户名册,这些客户的生命周期很短,而且经常上下波动。由于潜在的客户端数量,我需要一个支持快速插入/删除的集合。建议?

最佳答案

C5 通用集合库

我在 C# 和 C++ 中找到的最好的实现是这些——对于 C#/CLI:

  • http://www.itu.dk/research/c5/Release1.1/ITU-TR-2006-76.pdf
  • http://www.itu.dk/research/c5/

  • 它经过充分研究,具有可扩展的单元测试,并且自 2 月以来,他们还在 .Net 中实现了通用接口(interface),这使得使用集合变得更加容易。他们在 Channel9 上有特色他们已经对集合进行了广泛的性能测试。

    如果您无论如何都在使用数据结构,这些研究人员有一个 red-black-tree在他们的库中实现,类似于你在启动 Lütz 反射器并查看 System.Data 的内部结构时所发现的:p。插入复杂度:O(log(n))。

    无锁 C++ 集合

    那么,如果可以的话allow for some C++ interop并且您绝对需要速度并希望尽可能少的开销,那么来自 Dmitriy V'jukov 的这些无锁 ADT 可能是您在这个世界上可以获得的最好的,其性能优于英特尔的并发 ADT 库。
  • http://groups.google.com/group/lock-free

  • 我已经阅读了一些代码,它确实是精通这些东西如何组合在一起的人的气质。 VC++ 可以在没有烦人的边界的情况下进行本地 C++ 互操作。 http://www.swig.org/否则可以帮助您包装 C++ 接口(interface)以在 .Net 中使用,或者您可以通过 P/Invoke 自己完成。

    微软采取

    他们有写教程,this one implementing a rather unpolished skip-list in C# ,并讨论其他类型的数据结构。 (有一个更好的 SkipList at CodeProject ,它非常优美并且以良好的方式实现了接口(interface)。)它们还有一些与 .Net 捆绑在一起的数据结构,即 HashTable/Dictionary<,>HashSet .当然还有“ResizeArray”/List 类型以及堆栈和队列,但它们在搜索中都是“线性的”。

    谷歌的性能工具

    如果您希望加快内存分配所需的时间,您可以使用 google 的 perf-tools。它们可以在 google 代码中找到,并且包含一个 very interesting multi-threaded malloc-implementation (TCMalloc)这显示出比普通 malloc 更一致的时序。您可以将它与上面的无锁结构一起使用,以真正为性能而疯狂。

    通过内存提高响应时间

    您还可以对函数使用内存功能,通过缓存来提高性能,如果您正在使用,这会很有趣,例如F# . F# 还允许 C++ 互操作,所以你没问题。

    好的)

    也有可能使用在 bloom-filters 上完成的研究自己做一些事情。 ,这允许 O(k) 查找复杂度,其中 k 是一个常数,取决于您已实现的散列函数的数量。这就是谷歌的 BigTable 是如何实现的。如果元素在集合中,或者可能具有非常低的可能性,这些过滤器将为您获取不是您要查找的元素的元素(请参阅维基百科上的图表 - 它接近 P(wrong_key) -> 0.01 作为大小大约有 10000 个元素,但您可以通过实现进一步的哈希函数/减少集合来解决这个问题。

    我没有搜索过它的 .Net 实现,但由于散列计算是独立的,您可以使用 MS's performance team's implementation of Tasks to speed that up.

    “我的”采取——随机化以达到平均 O(log n)

    碰巧我刚刚做了一个涉及数据结构的类(class)。在这种情况下,我们使用了 C++,但很容易转换为 C#。我们构建了三种不同的数据结构;布隆过滤器、跳过列表和 random binary search tree .

    见最后一段后的代码和分析。

    基于硬件的“集合”

    最后,为了让我的回答“完整”,如果你真的需要速度,你可以使用类似 Routing-tables 的东西。或 Content-addressable memory .这使您原则上可以非常快速地进行 O(1) 的“散列”到值的数据查找。

    随机二叉搜索树/布隆过滤器 C++ 代码

    如果您在代码中发现错误,或者只是指出我如何做得更好(或更好地使用模板),我将非常感谢您的反馈。请注意,布隆过滤器与现实生活中的不同;通常,您不必能够从中删除,然后它比我为允许测试删除所做的 hack 更节省空间。

    数据结构.h
    #ifndef DATASTRUCTURE_H_
    #define DATASTRUCTURE_H_
    
    class DataStructure
    {
    public:
        DataStructure() {countAdd=0; countDelete=0;countFind=0;}
        virtual ~DataStructure() {}
    
        void resetCountAdd() {countAdd=0;}
        void resetCountFind() {countFind=0;}
        void resetCountDelete() {countDelete=0;}
    
        unsigned int getCountAdd(){return countAdd;}
        unsigned int getCountDelete(){return countDelete;}
        unsigned int getCountFind(){return countFind;}
    
    protected:
        unsigned int countAdd;
        unsigned int countDelete;
        unsigned int countFind;
    };
    
    #endif /*DATASTRUCTURE_H_*/
    

    Key.h
    #ifndef KEY_H_
    #define KEY_H_
    
    #include <string>
    using namespace std;
    
    const int keyLength = 128;
    
    class Key : public string
    {
    public:
        Key():string(keyLength, ' ') {}
        Key(const char in[]): string(in){}
        Key(const string& in): string(in){}
    
        bool operator<(const string& other);
        bool operator>(const string& other);
        bool operator==(const string& other);
    
        virtual ~Key() {}
    };
    
    #endif /*KEY_H_*/
    

    key .cpp
    #include "Key.h"
    
    bool Key::operator<(const string& other)
    {
        return compare(other) < 0;
    };
    
    bool Key::operator>(const string& other)
    {
        return compare(other) > 0;
    };
    
    bool Key::operator==(const string& other)
    {
        return compare(other) == 0;
    }
    

    布隆过滤器.h
    #ifndef BLOOMFILTER_H_
    #define BLOOMFILTER_H_
    
    #include <iostream>
    #include <assert.h>
    #include <vector>
    #include <math.h>
    #include "Key.h"
    #include "DataStructure.h"
    
    #define LONG_BIT 32
    #define bitmask(val) (unsigned long)(1 << (LONG_BIT - (val % LONG_BIT) - 1))
    
    // TODO: Implement RW-locking on the reads/writes to the bitmap.
    
    class BloomFilter : public DataStructure
    {
    public:
        BloomFilter(){}
        BloomFilter(unsigned long length){init(length);}
        virtual ~BloomFilter(){}
    
        void init(unsigned long length);
        void dump();
    
        void add(const Key& key);
        void del(const Key& key);
    
        /**
         * Returns true if the key IS BELIEVED to exist, false if it absolutely doesn't.
         */
        bool testExist(const Key& key, bool v = false);
    
    private:
        unsigned long hash1(const Key& key);
        unsigned long hash2(const Key& key);
        bool exist(const Key& key);
        void getHashAndIndicies(unsigned long& h1, unsigned long& h2, int& i1, int& i2, const Key& key);
        void getCountIndicies(const int i1, const unsigned long h1,
            const int i2, const unsigned long h2, int& i1_c, int& i2_c);
    
        vector<unsigned long> m_tickBook;
        vector<unsigned int> m_useCounts;
        unsigned long m_length; // number of bits in the bloom filter
        unsigned long m_pockets; //the number of pockets
    
        static const unsigned long m_pocketSize; //bits in each pocket
    };
    
    #endif /*BLOOMFILTER_H_*/
    

    BloomFilter.cpp
    #include "BloomFilter.h"
    
    const unsigned long BloomFilter::m_pocketSize = LONG_BIT;
    
    void BloomFilter::init(unsigned long length)
    {
        //m_length = length;
        m_length = (unsigned long)((2.0*length)/log(2))+1;
        m_pockets = (unsigned long)(ceil(double(m_length)/m_pocketSize));
        m_tickBook.resize(m_pockets);
    
        // my own (allocate nr bits possible to store in the other vector)
        m_useCounts.resize(m_pockets * m_pocketSize);
    
        unsigned long i; for(i=0; i< m_pockets; i++) m_tickBook[i] = 0;
        for (i = 0; i < m_useCounts.size(); i++) m_useCounts[i] = 0; // my own
    }
    
    unsigned long BloomFilter::hash1(const Key& key)
    {
        unsigned long hash = 5381;
        unsigned int i=0; for (i=0; i< key.length(); i++){
            hash = ((hash << 5) + hash) + key.c_str()[i]; /* hash * 33 + c */
        }
    
        double d_hash = (double) hash;
    
        d_hash *= (0.5*(sqrt(5)-1));
        d_hash -= floor(d_hash);
        d_hash *= (double)m_length;
    
        return (unsigned long)floor(d_hash);
    }
    
    unsigned long BloomFilter::hash2(const Key& key)
    {
        unsigned long hash = 0;
        unsigned int i=0; for (i=0; i< key.length(); i++){
            hash = key.c_str()[i] + (hash << 6) + (hash << 16) - hash;
        }
        double d_hash = (double) hash;
    
        d_hash *= (0.5*(sqrt(5)-1));
        d_hash -= floor(d_hash);
        d_hash *= (double)m_length;
    
        return (unsigned long)floor(d_hash);
    }
    
    bool BloomFilter::testExist(const Key& key, bool v){
        if(exist(key)) {
            if(v) cout<<"Key "<< key<<" is in the set"<<endl;
            return true;
        }else {
            if(v) cout<<"Key "<< key<<" is not in the set"<<endl;
            return false;
        }
    }
    
    void BloomFilter::dump()
    {
        cout<<m_pockets<<" Pockets: ";
    
        // I changed u to %p because I wanted it printed in hex.
        unsigned long i; for(i=0; i< m_pockets; i++) printf("%p ", (void*)m_tickBook[i]);
        cout<<endl;
    }
    
    void BloomFilter::add(const Key& key)
    {
        unsigned long h1, h2;
        int i1, i2;
        int i1_c, i2_c;
    
        // tested!
    
        getHashAndIndicies(h1, h2, i1, i2, key);
        getCountIndicies(i1, h1, i2, h2, i1_c, i2_c);
    
        m_tickBook[i1] = m_tickBook[i1] | bitmask(h1);
        m_tickBook[i2] = m_tickBook[i2] | bitmask(h2);
    
        m_useCounts[i1_c] = m_useCounts[i1_c] + 1;
        m_useCounts[i2_c] = m_useCounts[i2_c] + 1;
    
        countAdd++;
    }
    
    void BloomFilter::del(const Key& key)
    {
        unsigned long h1, h2;
        int i1, i2;
        int i1_c, i2_c;
    
        if (!exist(key)) throw "You can't delete keys which are not in the bloom filter!";
    
        // First we need the indicies into m_tickBook and the
        // hashes.
        getHashAndIndicies(h1, h2, i1, i2, key);
    
        // The index of the counter is the index into the bitvector
        // times the number of bits per vector item plus the offset into
        // that same vector item.
        getCountIndicies(i1, h1, i2, h2, i1_c, i2_c);
    
        // We need to update the value in the bitvector in order to
        // delete the key.
        m_useCounts[i1_c] = (m_useCounts[i1_c] == 1 ? 0 : m_useCounts[i1_c] - 1);
        m_useCounts[i2_c] = (m_useCounts[i2_c] == 1 ? 0 : m_useCounts[i2_c] - 1);
    
        // Now, if we depleted the count for a specific bit, then set it to
        // zero, by anding the complete unsigned long with the notted bitmask
        // of the hash value
        if (m_useCounts[i1_c] == 0)
            m_tickBook[i1] = m_tickBook[i1] & ~(bitmask(h1));
        if (m_useCounts[i2_c] == 0)
            m_tickBook[i2] = m_tickBook[i2] & ~(bitmask(h2));
    
        countDelete++;
    }
    
    bool BloomFilter::exist(const Key& key)
    {
        unsigned long h1, h2;
        int i1, i2;
    
        countFind++;
    
        getHashAndIndicies(h1, h2, i1, i2, key);
    
        return  ((m_tickBook[i1] & bitmask(h1)) > 0) &&
                ((m_tickBook[i2] & bitmask(h2)) > 0);
    }
    
    /*
     * Gets the values of the indicies for two hashes and places them in
     * the passed parameters. The index is into m_tickBook.
     */
    void BloomFilter::getHashAndIndicies(unsigned long& h1, unsigned long& h2, int& i1,
        int& i2, const Key& key)
    {
        h1 = hash1(key);
        h2 = hash2(key);
        i1 = (int) h1/m_pocketSize;
        i2 = (int) h2/m_pocketSize;
    }
    
    /*
     * Gets the values of the indicies into the count vector, which keeps
     * track of how many times a specific bit-position has been used.
     */
    void BloomFilter::getCountIndicies(const int i1, const unsigned long h1,
        const int i2, const unsigned long h2, int& i1_c, int& i2_c)
    {
        i1_c = i1*m_pocketSize + h1%m_pocketSize;
        i2_c = i2*m_pocketSize + h2%m_pocketSize;
    }
    

    ** RBST.h **
    #ifndef RBST_H_
    #define RBST_H_
    
    #include <iostream>
    #include <assert.h>
    #include <vector>
    #include <math.h>
    #include "Key.h"
    #include "DataStructure.h"
    
    #define BUG(str) printf("%s:%d FAILED SIZE INVARIANT: %s\n", __FILE__, __LINE__, str);
    
    using namespace std;
    
    class RBSTNode;
    class RBSTNode: public Key
    {
    public:
        RBSTNode(const Key& key):Key(key)
        {
            m_left =NULL;
            m_right = NULL;
            m_size = 1U; // the size of one node is 1.
        }
        virtual ~RBSTNode(){}
    
        string setKey(const Key& key){return Key(key);}
    
        RBSTNode* left(){return m_left; }
        RBSTNode* right(){return m_right;}
    
        RBSTNode* setLeft(RBSTNode* left) { m_left = left; return this; }
        RBSTNode* setRight(RBSTNode* right) { m_right =right; return this; }
    
    #ifdef DEBUG
        ostream& print(ostream& out)
        {
            out << "Key(" << *this << ", m_size: " << m_size << ")";
            return out;
        }
    #endif
    
        unsigned int size() { return m_size; }
    
        void setSize(unsigned int val)
        {
    #ifdef DEBUG
            this->print(cout);
            cout << "::setSize(" << val << ") called." << endl;
    #endif
    
            if (val == 0) throw "Cannot set the size below 1, then just delete this node.";
            m_size = val;
        }
    
        void incSize() {
    #ifdef DEBUG
            this->print(cout);
            cout << "::incSize() called" << endl;
    #endif
    
            m_size++;
        }
    
        void decrSize()
        {
    #ifdef DEBUG
            this->print(cout);
            cout << "::decrSize() called" << endl;
    #endif
    
            if (m_size == 1) throw "Cannot decrement size below 1, then just delete this node.";
            m_size--;
        }
    
    #ifdef DEBUG
        unsigned int size(RBSTNode* x);
    #endif
    
    private:
        RBSTNode(){}
        RBSTNode* m_left;
        RBSTNode* m_right;
        unsigned int m_size;
    };
    
    class RBST : public DataStructure
    {
    public:
        RBST() {
            m_size = 0;
            m_head = NULL;
            srand(time(0));
        };
    
        virtual ~RBST() {};
    
        /**
         * Tries to add key into the tree and will return
         *      true  for a new item added
         *      false if the key already is in the tree.
         *
         * Will also have the side-effect of printing to the console if v=true.
         */
        bool add(const Key& key, bool v=false);
    
        /**
         * Same semantics as other add function, but takes a string,
         * but diff name, because that'll cause an ambiguity because of inheritance.
         */
        bool addString(const string& key);
    
        /**
         * Deletes a key from the tree if that key is in the tree.
         * Will return
         *      true  for success and
         *      false for failure.
         *
         * Will also have the side-effect of printing to the console if v=true.
         */
        bool del(const Key& key, bool v=false);
    
        /**
         * Tries to find the key in the tree and will return
         *      true if the key is in the tree and
         *      false if the key is not.
         *
         * Will also have the side-effect of printing to the console if v=true.
         */
        bool find(const Key& key, bool v = false);
    
        unsigned int count() { return m_size; }
    
    #ifdef DEBUG
        int dump(char sep = ' ');
        int dump(RBSTNode* target, char sep);
        unsigned int size(RBSTNode* x);
    #endif
    
    private:
        RBSTNode* randomAdd(RBSTNode* target, const Key& key);
        RBSTNode* addRoot(RBSTNode* target, const Key& key);
        RBSTNode* rightRotate(RBSTNode* target);
        RBSTNode* leftRotate(RBSTNode* target);
    
        RBSTNode* del(RBSTNode* target, const Key& key);
        RBSTNode* join(RBSTNode* left, RBSTNode* right);
    
        RBSTNode* find(RBSTNode* target, const Key& key);
    
        RBSTNode* m_head;
        unsigned int m_size;
    };
    
    #endif /*RBST_H_*/
    

    ** RBST.cpp **
    #include "RBST.h"
    
    bool RBST::add(const Key& key, bool v){
        unsigned int oldSize = m_size;
        m_head = randomAdd(m_head, key);
        if (m_size > oldSize){
            if(v) cout<<"Node "<<key<< " is added into the tree."<<endl;
            return true;
        }else {
            if(v) cout<<"Node "<<key<< " is already in the tree."<<endl;
            return false;
        }
        if(v) cout<<endl;
    };
    
    bool RBST::addString(const string& key) {
        return add(Key(key), false);
    }
    
    bool RBST::del(const Key& key, bool v){
        unsigned oldSize= m_size;
        m_head = del(m_head, key);
        if (m_size < oldSize) {
            if(v) cout<<"Node "<<key<< " is deleted from the tree."<<endl;
            return true;
        }
        else {
            if(v) cout<< "Node "<<key<< " is not in the tree."<<endl;
            return false;
        }
    };
    
    bool RBST::find(const Key& key, bool v){
        RBSTNode* ret = find(m_head, key);
        if (ret == NULL){
            if(v) cout<< "Node "<<key<< " is not in the tree."<<endl;
            return false;
        }else {
            if(v) cout<<"Node "<<key<< " is in the tree."<<endl;
            return true;
        }
    };
    
    #ifdef DEBUG
    int RBST::dump(char sep){
        int ret = dump(m_head, sep);
        cout<<"SIZE: " <<ret<<endl;
        return ret;
    };
    
    int RBST::dump(RBSTNode* target, char sep){
        if (target == NULL) return 0;
        int ret = dump(target->left(), sep);
        cout<< *target<<sep;
        ret ++;
        ret += dump(target->right(), sep);
        return ret;
    };
    #endif
    
    /**
     * Rotates the tree around target, so that target's left
     * is the new root of the tree/subtree and updates the subtree sizes.
     *
     *(target)  b               (l) a
     *         / \      right      / \
     *        a   ?     ---->     ?   b
     *       / \                     / \
     *      ?   x                   x   ?
     *
     */
    RBSTNode* RBST::rightRotate(RBSTNode* target) // private
    {
        if (target == NULL) throw "Invariant failure, target is null"; // Note: may be removed once tested.
        if (target->left() == NULL) throw "You cannot rotate right around a target whose left node is NULL!";
    
    #ifdef DEBUG
        cout    <<"Right-rotating b-node ";
        target->print(cout);
        cout    << " for a-node ";
        target->left()->print(cout);
        cout    << "." << endl;
    #endif
    
        RBSTNode* l = target->left();
        int as0 = l->size();
    
        // re-order the sizes
        l->setSize( l->size() + (target->right() == NULL ? 0 : target->right()->size()) + 1); // a.size += b.right.size + 1; where b.right may be null.
        target->setSize( target->size() -as0 + (l->right() == NULL ? 0 : l->right()->size()) ); // b.size += -a_0_size + x.size where x may be null.
    
        // swap b's left (for a)
        target->setLeft(l->right());
    
        // and a's right (for b's left)
        l->setRight(target);
    
    #ifdef DEBUG
        cout    << "A-node size: " << l->size() << ", b-node size: " << target->size() << "." << endl;
    #endif
    
        // return the new root, a.
        return l;
    };
    
    /**
     * Like rightRotate, but the other way. See docs for rightRotate(RBSTNode*)
     */
    RBSTNode* RBST::leftRotate(RBSTNode* target)
    {
        if (target == NULL) throw "Invariant failure, target is null";
        if (target->right() == NULL) throw "You cannot rotate left around a target whose right node is NULL!";
    
    #ifdef DEBUG
        cout    <<"Left-rotating a-node ";
        target->print(cout);
        cout    << " for b-node ";
        target->right()->print(cout);
        cout    << "." << endl;
    #endif
    
        RBSTNode* r = target->right();
        int bs0 = r->size();
    
        // re-roder the sizes
        r->setSize(r->size() + (target->left() == NULL ? 0 : target->left()->size()) + 1);
        target->setSize(target->size() -bs0 + (r->left() == NULL ? 0 : r->left()->size()));
    
        // swap a's right (for b's left)
        target->setRight(r->left());
    
        // swap b's left (for a)
        r->setLeft(target);
    
    #ifdef DEBUG
        cout    << "Left-rotation done: a-node size: " << target->size() << ", b-node size: " << r->size() << "." << endl;
    #endif
    
        return r;
    };
    
    //
    /**
     * Adds a key to the tree and returns the new root of the tree.
     * If the key already exists doesn't add anything.
     * Increments m_size if the key didn't already exist and hence was added.
     *
     * This function is not called from public methods, it's a helper function.
     */
    RBSTNode* RBST::addRoot(RBSTNode* target, const Key& key)
    {
        countAdd++;
    
        if (target == NULL) return new RBSTNode(key);
    
    #ifdef DEBUG
        cout << "addRoot(";
        cout.flush();
        target->print(cout) << "," << key << ") called." << endl;
    #endif
    
        if (*target < key)
        {
            target->setRight( addRoot(target->right(), key) );
            target->incSize(); // Should I?
            RBSTNode* res = leftRotate(target);
    #ifdef DEBUG
            if (target->size() != size(target))
                BUG("in addRoot 1");
    #endif
            return res;
        }
    
        target->setLeft( addRoot(target->left(), key) );
        target->incSize(); // Should I?
        RBSTNode* res = rightRotate(target);
    #ifdef DEBUG
        if (target->size() != size(target))
            BUG("in addRoot 2");
    #endif
        return res;
    };
    
    /**
     * This function is called from the public add(key) function,
     * and returns the new root node.
     */
    RBSTNode* RBST::randomAdd(RBSTNode* target, const Key& key)
    {
        countAdd++;
    
        if (target == NULL)
        {
            m_size++;
            return new RBSTNode(key);
        }
    
    #ifdef DEBUG
        cout << "randomAdd(";
        target->print(cout) << ", \"" << key << "\") called." << endl;
    #endif
    
        int r = (rand() % target->size()) + 1;
    
        // here is where we add the target as root!
        if (r == 1)
        {
            m_size++;   // TODO: Need to lock.
            return addRoot(target, key);
        }
    
    #ifdef DEBUG
        printf("randomAdd recursion part, ");
    #endif
    
        // otherwise, continue recursing!
        if (*target <= key)
        {
    #ifdef DEBUG
        printf("target <= key\n");
    #endif
            target->setRight( randomAdd(target->right(), key) );
            target->incSize(); // TODO: Need to lock.
    #ifdef DEBUG
            if (target->right()->size() != size(target->right()))
                BUG("in randomAdd 1");
    #endif
        }
        else
        {
    #ifdef DEBUG
        printf("target > key\n");
    #endif
            target->setLeft( randomAdd(target->left(), key) );
            target->incSize(); // TODO: Need to lock.
    #ifdef DEBUG
            if (target->left()->size() != size(target->left()))
                BUG("in randomAdd 2");
    #endif
        }
    
    #ifdef DEBUG
        printf("randomAdd return part\n");
    #endif
    
        m_size++;       // TODO: Need to lock.
        return target;
    };
    
    /////////////////////////////////////////////////////////////
    /////////////////////  DEL FUNCTIONS ////////////////////////
    /////////////////////////////////////////////////////////////
    
    /**
     * Deletes a node with the passed key.
     * Returns the root node.
     * Decrements m_size if something was deleted.
     */
    RBSTNode* RBST::del(RBSTNode* target, const Key& key)
    {
        countDelete++;
    
        if (target == NULL) return NULL;
    
    #ifdef DEBUG
        cout << "del(";
        target->print(cout) << ", \"" << key << "\") called." << endl;
    #endif
    
        RBSTNode* ret = NULL;
    
        // found the node to delete
        if (*target == key)
        {
            ret = join(target->left(), target->right());
    
            m_size--;
            delete target;
    
            return ret; // return the newly built joined subtree!
        }
    
        // store a temporary size before recursive deletion.
        unsigned int size = m_size;
    
        if (*target < key)  target->setRight( del(target->right(), key) );
        else                target->setLeft( del(target->left(), key) );
    
        // if the previous recursion changed the size, we need to decrement the size of this target too.
        if (m_size < size) target->decrSize();
    
    #ifdef DEBUG
        if (RBST::size(target) != target->size())
            BUG("in del");
    #endif
    
        return target;
    };
    
    /**
     * Joins the two subtrees represented by left and right
     * by randomly choosing which to make the root, weighted on the
     * size of the sub-tree.
     */
    RBSTNode* RBST::join(RBSTNode* left, RBSTNode* right)
    {
        if (left == NULL) return right;
        if (right == NULL) return left;
    
    #ifdef DEBUG
        cout << "join(";
        left->print(cout);
        cout << ",";
        right->print(cout) << ") called." << endl;
    #endif
    
        // Find the chance that we use the left tree, based on its size over the total tree size.
        // 3 s.d. randomness :-p e.g. 60.3% chance.
        bool useLeft = ((rand()%1000) < (signed)((float)left->size()/(float)(left->size() + right->size()) * 1000.0));
    
        RBSTNode* subtree = NULL;
    
        if (useLeft)
        {
            subtree = join(left->right(), right);
    
            left->setRight(subtree)
                ->setSize((left->left() == NULL ? 0 : left->left()->size())
                            + subtree->size() + 1 );
    
    #ifdef DEBUG
            if (size(left) != left->size())
                BUG("in join 1");
    #endif
    
            return left;
        }
    
        subtree = join(right->left(), left);
    
        right->setLeft(subtree)
             ->setSize((right->right() == NULL ? 0 : right->right()->size())
                        + subtree->size() + 1);
    
    #ifdef DEBUG
        if (size(right) != right->size())
            BUG("in join 2");
    #endif
    
        return right;
    };
    
    /////////////////////////////////////////////////////////////
    /////////////////////  FIND FUNCTIONS ///////////////////////
    /////////////////////////////////////////////////////////////
    
    /**
     * Tries to find the key in the tree starting
     * search from target.
     *
     * Returns NULL if it was not found.
     */
    RBSTNode* RBST::find(RBSTNode* target, const Key& key)
    {
        countFind++; // Could use private method only counting the first call.
        if (target == NULL) return NULL; // not found.
        if (*target == key) return target; // found (does string override ==?)
        if (*target < key) return find(target->right(), key); // search for gt to the right.
        return find(target->left(), key); // search for lt to the left.
    };
    
    #ifdef DEBUG
    
    unsigned int RBST::size(RBSTNode* x)
    {
        if (x == NULL) return 0;
        return 1 + size(x->left()) + size(x->right());
    }
    
    #endif
    

    我将再次保存 SkipList,因为已经可以从链接中找到 SkipList 的良好实现,而且我的版本没有太大不同。

    从测试文件生成的图形如下:

    图表显示为 BloomFilter、RBST 和 SkipList 添加新项目所花费的时间。
    graph http://haf.se/content/dl/addtimer.png

    图表显示了为 BloomFilter、RBST 和 SkipList 查找项目所花费的时间
    graph http://haf.se/content/dl/findtimer.png

    图表显示删除 BloomFilter、RBST 和 SkipList 项目所花费的时间
    graph http://haf.se/content/dl/deltimer.png

    正如你所看到的,随机二叉搜索树比 SkipList 好很多。布隆过滤器符合其 O(k)。

    关于c# - 用于快速插入/删除的 .net 集合,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/576176/

    有关c# - 用于快速插入/删除的 .net 集合的更多相关文章

    1. ruby-on-rails - Ruby net/ldap 模块中的内存泄漏 - 2

      作为我的Rails应用程序的一部分,我编写了一个小导入程序,它从我们的LDAP系统中吸取数据并将其塞入一个用户表中。不幸的是,与LDAP相关的代码在遍历我们的32K用户时泄漏了大量内存,我一直无法弄清楚如何解决这个问题。这个问题似乎在某种程度上与LDAP库有关,因为当我删除对LDAP内容的调用时,内存使用情况会很好地稳定下来。此外,不断增加的对象是Net::BER::BerIdentifiedString和Net::BER::BerIdentifiedArray,它们都是LDAP库的一部分。当我运行导入时,内存使用量最终达到超过1GB的峰值。如果问题存在,我需要找到一些方法来更正我的代

    2. ruby-on-rails - Rails 常用字符串(用于通知和错误信息等) - 2

      大约一年前,我决定确保每个包含非唯一文本的Flash通知都将从模块中的方法中获取文本。我这样做的最初原因是为了避免一遍又一遍地输入相同的字符串。如果我想更改措辞,我可以在一个地方轻松完成,而且一遍又一遍地重复同一件事而出现拼写错误的可能性也会降低。我最终得到的是这样的:moduleMessagesdefformat_error_messages(errors)errors.map{|attribute,message|"Error:#{attribute.to_s.titleize}#{message}."}enddeferror_message_could_not_find(obje

    3. ruby-on-rails - 如何从 format.xml 中删除 <hash></hash> - 2

      我有一个对象has_many应呈现为xml的子对象。这不是问题。我的问题是我创建了一个Hash包含此数据,就像解析器需要它一样。但是rails自动将整个文件包含在.........我需要摆脱type="array"和我该如何处理?我没有在文档中找到任何内容。 最佳答案 我遇到了同样的问题;这是我的XML:我在用这个:entries.to_xml将散列数据转换为XML,但这会将条目的数据包装到中所以我修改了:entries.to_xml(root:"Contacts")但这仍然将转换后的XML包装在“联系人”中,将我的XML代码修改为

    4. ruby - 我可以使用 Ruby 从 CSV 中删除列吗? - 2

      查看Ruby的CSV库的文档,我非常确定这是可能且简单的。我只需要使用Ruby删除CSV文件的前三列,但我没有成功运行它。 最佳答案 csv_table=CSV.read(file_path_in,:headers=>true)csv_table.delete("header_name")csv_table.to_csv#=>ThenewCSVinstringformat检查CSV::Table文档:http://ruby-doc.org/stdlib-1.9.2/libdoc/csv/rdoc/CSV/Table.html

    5. ruby - 如何模拟 Net::HTTP::Post? - 2

      是的,我知道最好使用webmock,但我想知道如何在RSpec中模拟此方法:defmethod_to_testurl=URI.parseurireq=Net::HTTP::Post.newurl.pathres=Net::HTTP.start(url.host,url.port)do|http|http.requestreq,foo:1endresend这是RSpec:let(:uri){'http://example.com'}specify'HTTPcall'dohttp=mock:httpNet::HTTP.stub!(:start).and_yieldhttphttp.shou

    6. ruby - 我可以使用 aws-sdk-ruby 在 AWS S3 上使用事务性文件删除/上传吗? - 2

      我发现ActiveRecord::Base.transaction在复杂方法中非常有效。我想知道是否可以在如下事务中从AWSS3上传/删除文件:S3Object.transactiondo#writeintofiles#raiseanexceptionend引发异常后,每个操作都应在S3上回滚。S3Object这可能吗?? 最佳答案 虽然S3API具有批量删除功能,但它不支持事务,因为每个删除操作都可以独立于其他操作成功/失败。该API不提供任何批量上传功能(通过PUT或POST),因此每个上传操作都是通过一个独立的API调用完成的

    7. Ruby Sinatra 配置用于生产和开发 - 2

      我已经在Sinatra上创建了应用程序,它代表了一个简单的API。我想在生产和开发上进行部署。我想在部署时选择,是开发还是生产,一些方法的逻辑应该改变,这取决于部署类型。是否有任何想法,如何完成以及解决此问题的一些示例。例子:我有代码get'/api/test'doreturn"Itisdev"end但是在部署到生产环境之后我想在运行/api/test之后看到ItisPROD如何实现? 最佳答案 根据SinatraDocumentation:EnvironmentscanbesetthroughtheRACK_ENVenvironm

    8. c# - 如何在 ruby​​ 中调用 C# dll? - 2

      如何在ruby​​中调用C#dll? 最佳答案 我能想到几种可能性:为您的DLL编写(或找人编写)一个COM包装器,如果它还没有,则使用Ruby的WIN32OLE库来调用它;看看RubyCLR,其中一位作者是JohnLam,他继续在Microsoft从事IronRuby方面的工作。(估计不会再维护了,可能不支持.Net2.0以上的版本);正如其他地方已经提到的,看看使用IronRuby,如果这是您的技术选择。有一个主题是here.请注意,最后一篇文章实际上来自JohnLam(看起来像是2009年3月),他似乎很自在地断言RubyCL

    9. C# 到 Ruby sha1 base64 编码 - 2

      我正在尝试在Ruby中复制Convert.ToBase64String()行为。这是我的C#代码:varsha1=newSHA1CryptoServiceProvider();varpasswordBytes=Encoding.UTF8.GetBytes("password");varpasswordHash=sha1.ComputeHash(passwordBytes);returnConvert.ToBase64String(passwordHash);//returns"W6ph5Mm5Pz8GgiULbPgzG37mj9g="当我在Ruby中尝试同样的事情时,我得到了相同sha

    10. ruby - Net::HTTP 获取源代码和状态 - 2

      我目前正在使用以下方法获取页面的源代码:Net::HTTP.get(URI.parse(page.url))我还想获取HTTP状态,而无需发出第二个请求。有没有办法用另一种方法做到这一点?我一直在查看文档,但似乎找不到我要找的东西。 最佳答案 在我看来,除非您需要一些真正的低级访问或控制,否则最好使用Ruby的内置Open::URI模块:require'open-uri'io=open('http://www.example.org/')#=>#body=io.read[0,50]#=>"["200","OK"]io.base_ur

    随机推荐