#include <category.h>

Public Member Functions | |
| virtual long | append (const char *dt, const char *df, const uint32_t nold, const uint32_t nnew, const uint32_t nbuf, char *buf) |
Append the data file stored in directory df to the corresponding data file in directory dt. | |
| virtual double | estimateCost (const ibis::qDiscreteRange &cmp) const |
| virtual double | estimateCost (const ibis::qContinuousRange &cmp) const |
| Estimate the cost of evaluating the query expression. | |
| virtual double | estimateCost (const ibis::qMultiString &cmp) const |
| virtual double | estimateCost (const ibis::qString &cmp) const |
| virtual const char * | findString (const char *str) const |
| If the input string is found in the data file, it is returned; otherwise, this function returns 0. | |
| virtual void | getString (uint32_t i, std::string &val) const |
Return the string value for the ith row. | |
| const column * | IDColumnForKeywordIndex () const |
| virtual long | keywordSearch (const char *str) const |
| virtual long | keywordSearch (const char *str, ibis::bitvector &hits) const |
| virtual void | print (std::ostream &out) const |
| virtual long | search (const std::vector< std::string > &strs) const |
| virtual long | search (const char *str) const |
| virtual long | search (const std::vector< std::string > &strs, ibis::bitvector &hits) const |
| Given a group of string literals, return a bitvector that marks the strings matching any one of the input strings. | |
| virtual long | search (const char *str, ibis::bitvector &hits) const |
| Given a string literal, return a bitvector that marks the strings that match it. | |
| virtual std::vector< std::string > * | selectStrings (const bitvector &mask) const |
| virtual array_t< uint32_t > * | selectUInts (const bitvector &mask) const |
| Return the integer values of the records marked 1 in the mask. | |
| text (const ibis::column &col) | |
| text (const part *tbl, const char *name, ibis::TYPE_T t=ibis::TEXT) | |
| text (const part *tbl, FILE *file) | |
| virtual void | write (FILE *file) const |
| Write the current content to the TDC file. | |
Protected Member Functions | |
| int | readString (std::string &, int, long, long, char *, uint32_t, uint32_t &, off_t &) const |
| Read one string from an open file. | |
| void | readString (uint32_t i, std::string &val) const |
Read the string value of the ith row. | |
| void | startPositions (const char *dir, char *buf, uint32_t nbuf) const |
| Locate the starting position of each string and write the positions as unsigned integers to a file with .sp as extension. | |
The keyword search operation is implemented through a boolean term-document matrix (ibis::keywords) that is actually generated externally.
| long ibis::text::append | ( | const char * | dt, | |
| const char * | df, | |||
| const uint32_t | nold, | |||
| const uint32_t | nnew, | |||
| const uint32_t | nbuf, | |||
| char * | buf | |||
| ) | [virtual] |
Append the data file stored in directory df to the corresponding data file in directory dt.
Use the buffer buf to copy data in large chunks.
Does not check for missing entries. May cause records to be misaligned.
Reimplemented from ibis::column.
Reimplemented in ibis::category.
References ibis::gVerbose, and startPositions().
| const char * ibis::text::findString | ( | const char * | str | ) | const [virtual] |
If the input string is found in the data file, it is returned; otherwise, this function returns 0.
It needs to keep both the data file and the starting position file open at the same time.
Reimplemented from ibis::column.
References ibis::util::buffer< T >::address(), ibis::part::currentDataDir(), ibis::gVerbose, ibis::fileManager::instance(), ibis::part::name(), ibis::part::nRows(), ibis::fileManager::recordPages(), ibis::util::buffer< T >::size(), and startPositions().
| virtual void ibis::text::getString | ( | uint32_t | i, | |
| std::string & | val | |||
| ) | const [inline, virtual] |
Return the string value for the ith row.
Only valid for ibis::text and ibis::category.
Reimplemented from ibis::column.
Reimplemented in ibis::category.
References readString().
| int ibis::text::readString | ( | std::string & | res, | |
| int | fdes, | |||
| long | be, | |||
| long | en, | |||
| char * | buf, | |||
| uint32_t | nbuf, | |||
| uint32_t & | inbuf, | |||
| off_t & | boffset | |||
| ) | const [protected] |
Read one string from an open file.
The string starts at position be and ends at en. The content may be in the array buf.
References ibis::gVerbose.
| void ibis::text::readString | ( | uint32_t | i, | |
| std::string & | ret | |||
| ) | const [protected] |
Read the string value of the ith row.
This is a two-stage process that reads from two files: first the .sp file, to obtain the position of the string in the data file, and then the data file itself, which contains the actual string values (with nil terminators).
This can be quite slow!
References ibis::part::currentDataDir(), ibis::fileManager::instance(), ibis::part::nRows(), and ibis::fileManager::recordPages().
Referenced by getString().
| long ibis::text::search | ( | const std::vector< std::string > & | strs, | |
| ibis::bitvector & | hits | |||
| ) | const [virtual] |
Given a group of string literals, return a bitvector that marks the strings matching any one of the input strings.
Reimplemented in ibis::category.
References ibis::util::buffer< T >::address(), ibis::bitvector::adjustSize(), ibis::bitvector::clear(), ibis::bitvector::cnt(), ibis::part::currentDataDir(), ibis::gVerbose, ibis::fileManager::instance(), ibis::part::nRows(), ibis::fileManager::recordPages(), search(), ibis::bitvector::set(), ibis::bitvector::setBit(), ibis::bitvector::size(), ibis::util::buffer< T >::size(), and startPositions().
| long ibis::text::search | ( | const char * | str, | |
| ibis::bitvector & | hits | |||
| ) | const [virtual] |
Given a string literal, return a bitvector that marks the strings that match it.
Reimplemented in ibis::category.
References ibis::util::buffer< T >::address(), ibis::bitvector::adjustSize(), ibis::bitvector::clear(), ibis::bitvector::cnt(), ibis::part::currentDataDir(), ibis::gVerbose, ibis::fileManager::instance(), ibis::part::nRows(), ibis::fileManager::recordPages(), ibis::bitvector::setBit(), ibis::bitvector::size(), ibis::util::buffer< T >::size(), and startPositions().
Referenced by ibis::part::lookforString(), and search().
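A minimal sketch of marking the rows that match a literal is given below. It makes several assumptions that the documentation does not confirm: `markMatches` is a hypothetical helper, `std::vector<bool>` stands in for ibis::bitvector, the comparison is exact and case-sensitive, and the return value is the number of hits.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of FastBit: scan nul-terminated strings
// stored back to back in `data` and set hits[row] to true when row's string
// equals `literal` (exact, case-sensitive comparison is an assumption).
// Returns the number of matching rows (an assumed convention).
long markMatches(const std::string &data, const std::string &literal,
                 std::vector<bool> &hits) {
    hits.clear();
    long count = 0;
    std::size_t begin = 0;
    while (begin < data.size()) {
        std::size_t end = data.find('\0', begin);
        if (end == std::string::npos) end = data.size(); // unterminated tail
        const bool match = data.compare(begin, end - begin, literal) == 0;
        hits.push_back(match);
        if (match) ++count;
        begin = end + 1; // skip the terminator
    }
    return count;
}
```

The group version can be built on the same scan by testing each row against a set of literals instead of a single one.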
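| array_t< uint32_t > * ibis::text::selectUInts | ( | const bitvector & | mask | ) | const [virtual] |
Return the integer values of the records marked 1 in the mask.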
Return the positions of the bits that are marked 1.
This indicates to ibis::bundle that every string value is distinct. It also forces the sorting procedure to produce an order following the order of the entries in the table. This makes the printout of an ibis::text field considerably less useful than others!
Reimplemented from ibis::column.
Reimplemented in ibis::category.
References ibis::bitvector::firstIndexSet(), and array_t< T >::push_back().
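Returning the positions of the bits marked 1 can be sketched as below. This is only an illustration: `setBitPositions` is a hypothetical helper, and `std::vector<bool>` and `std::vector<std::uint32_t>` stand in for ibis::bitvector and array_t< uint32_t >.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of FastBit: collect the positions of the
// bits that are set in `mask`, in increasing order.
std::vector<std::uint32_t> setBitPositions(const std::vector<bool> &mask) {
    std::vector<std::uint32_t> positions;
    for (std::size_t i = 0; i < mask.size(); ++i) {
        if (mask[i]) {
            positions.push_back(static_cast<std::uint32_t>(i));
        }
    }
    return positions;
}
```

Because the result is a list of row positions, every selected row gets a different value, which is consistent with the note that each string is treated as distinct.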
| void ibis::text::startPositions | ( | const char * | dir, | |
| char * | buf, | |||
| uint32_t | nbuf | |||
| ) | const [protected] |
Locate the starting position of each string and write the positions as unsigned integers to a file with .sp as extension.
It uses the data file located in the named directory dir.
If dir is a nil pointer, the directory defaults to the current working directory of the data partition.
Argument buf (with nbuf bytes) is used as temporary work space. If nbuf = 0, this function allocates its own working space.
References ibis::util::buffer< T >::address(), ibis::part::currentDataDir(), ibis::gVerbose, ibis::part::nRows(), and ibis::util::buffer< T >::size().
Referenced by append(), findString(), and search().
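Locating the start of each string can be sketched as below. The sketch works on an in-memory copy of the data rather than a file, and `findStartPositions` is a hypothetical helper; the 64-bit offset type is an assumption, and writing the offsets to the .sp file is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper, not part of FastBit: return the byte offset at which
// each nul-terminated string in `data` begins.
std::vector<std::uint64_t> findStartPositions(const std::string &data) {
    std::vector<std::uint64_t> starts;
    std::uint64_t begin = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (data[i] == '\0') {   // end of the current string
            starts.push_back(begin);
            begin = i + 1;       // the next string starts after the terminator
        }
    }
    return starts;
}
```

An empty string still occupies one terminator byte, so it receives its own start position and the row numbering stays aligned with the data.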