- `ColumnFamilyOptions::sample_for_compression` now takes effect for creation of all block-based tables. Previously it only took effect for block-based tables created by flush.
- `CompactFiles()` can no longer compact files from a lower level to a higher level, which risks corrupting the DB (details: #8063). The validation is also added to all compactions.
- Use `strerror_r()` to get error messages.
- … when `Env` has the high-pri thread pool disabled (`Env::GetBackgroundThreads(Env::Priority::HIGH) == 0`).
- Use `yield` instead of `wfe` to relax the CPU and gain better performance.
- Added `TableProperties::slow_compression_estimated_data_size` and `TableProperties::fast_compression_estimated_data_size`. When `ColumnFamilyOptions::sample_for_compression > 0`, they estimate what `TableProperties::data_size` would have been if the "fast" or "slow" compression (see the `ColumnFamilyOptions::sample_for_compression` API doc for definitions) had been used instead.
- Write delays are now applied only once `delayed_write_rate` is actually exceeded, with an initial burst allowance of 1 millisecond worth of bytes. Also, beyond the initial burst allowance, `delayed_write_rate` is now more strictly enforced, especially with multiple column families.
- Changed the default of `BackupableDBOptions::share_files_with_checksum` to `true` and deprecated `false` because of its potential for data loss. Note that accepting this change in behavior can temporarily increase backup data usage, because files are not shared between backups using the two different settings. Also removed the obsolete option `kFlagMatchInterimNaming`.
- Added `FilterBlobByKey()` to `CompactionFilter`. Subclasses can override this method so that compaction filters can determine whether the actual blob value has to be read during compaction. Use the new `kUndetermined` value in `CompactionFilter::Decision` to indicate that further action is necessary for the compaction filter to make a decision.
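The burst-allowance behavior in the `delayed_write_rate` entry above is essentially a small token bucket. Below is a minimal, self-contained sketch of that idea; the class and method names are illustrative, not RocksDB's internal API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Illustrative sketch (not RocksDB internals): a write-delay budget with an
// initial burst allowance of one millisecond's worth of bytes. Bytes are
// taken from the burst credit first; anything beyond it is paid for at the
// configured rate.
class WriteDelayBudget {
 public:
  explicit WriteDelayBudget(uint64_t rate_bytes_per_sec)
      : rate_(rate_bytes_per_sec), credit_(rate_bytes_per_sec / 1000) {}

  // Microseconds the caller should stall before writing `bytes`.
  uint64_t StallMicros(uint64_t bytes) {
    const uint64_t from_credit = std::min(bytes, credit_);
    credit_ -= from_credit;
    const uint64_t remaining = bytes - from_credit;
    return remaining * 1000000 / rate_;  // pay at the configured rate
  }

 private:
  uint64_t rate_;    // bytes per second
  uint64_t credit_;  // unused burst allowance, in bytes
};
```

With a 1 MB/s rate, the first 1000 bytes (one millisecond's worth) incur no stall; bytes beyond the allowance are charged at the full rate.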
- Added protection of the contents of a `WriteBatch` through the write to RocksDB's in-memory update buffer (memtable). This is intended to detect some cases of in-memory data corruption, due to either software or hardware errors. Users can enable protection by constructing their `WriteBatch` with `protection_bytes_per_key == 8`.
- Added the `full_history_ts_low` option to manual compaction, which is for GC of old timestamp data.
- Added support for plugins (e.g., custom `FileSystem`s) whose source code resides outside the RocksDB repo. See "plugin/README.md" for developer details, and "PLUGINS.md" for a listing of available plugins.
- The new BlobDB implementation is part of the `rocksdb::DB` API, as opposed to the separate `rocksdb::blob_db::BlobDB` interface used by the earlier version, and can be configured on a per-column-family basis using the configuration options `enable_blob_files`, `min_blob_size`, `blob_file_size`, `blob_compression_type`, `enable_blob_garbage_collection`, and `blob_garbage_collection_age_cutoff`. It extends RocksDB's consistency guarantees to blobs, and offers more features and better performance. Note that some features, most notably `Merge`, compaction filters, and backup/restore, are not yet supported, and there is no support for migrating a database created by the old implementation.
- `TransactionDB` returns error `Status`es from calls to `DeleteRange()` and calls to `Write()` where the `WriteBatch` contains a range deletion. Previously such operations may have succeeded while not providing the expected transactional guarantees. There are certain cases where range deletion can still be used on such DBs; see the API doc on `TransactionDB::DeleteRange()` for details.
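The `WriteBatch` protection entry above pairs each buffered operation with a small integrity code that is rechecked before the entry is applied. The sketch below shows the general shape of such a scheme; the FNV-1a hash and all names are stand-ins, not RocksDB's actual protection format.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Illustrative 8-byte per-key protection code, derived from the operation
// type, key, and value at buffering time. FNV-1a is only a stand-in hash.
uint64_t ProtectionCode(char op, const std::string& key,
                        const std::string& value) {
  uint64_t h = 1469598103934665603ull;  // FNV offset basis
  auto mix = [&h](unsigned char c) { h = (h ^ c) * 1099511628211ull; };
  mix(static_cast<unsigned char>(op));
  for (char c : key) mix(static_cast<unsigned char>(c));
  for (char c : value) mix(static_cast<unsigned char>(c));
  return h;
}

// True if the buffered entry still matches the code computed at insertion
// time, i.e., no in-memory corruption was detected before applying it.
bool VerifyEntry(char op, const std::string& key, const std::string& value,
                 uint64_t expected_code) {
  return ProtectionCode(op, key, value) == expected_code;
}
```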
- `OptimisticTransactionDB` now returns error `Status`es from calls to `DeleteRange()` and calls to `Write()` where the `WriteBatch` contains a range deletion. Previously such operations may have succeeded while not providing the expected transactional guarantees.
- Fixed an issue where `WRITE_PREPARED` and `WRITE_UNPREPARED` TransactionDB `MultiGet()` could return uncommitted data with a snapshot.
- Added `CompressionOptions::max_dict_buffer_bytes`, to limit the in-memory buffering used for selecting samples for generating/training a dictionary. The limit is currently loosely adhered to.
- In `DB::VerifyFileChecksums()`, we now fail with `Status::InvalidArgument` if the name of the checksum generator used for verification does not match the name of the checksum generator used for protecting the file when it was created.
- … in `ErrorHandler::SetBGError`.
- Older versions cannot recognize the new MANIFEST records `WalAddition` and `WalDeletion`; fixed this by changing their encoded format to be ignorable by older versions.
- Writing a merge operand without a configured `merge_operator` now fails immediately, causing the DB to enter read-only mode. Previously, failure was deferred until the `merge_operator` was needed by a user read or a background operation.
- Missing WAL data is now detected when `WALRecoveryMode::kPointInTimeRecovery` is used. Gaps are still possible when WALs are truncated exactly on record boundaries; for complete protection, users should enable `track_and_verify_wals_in_manifest`.
- Fixed the handling of `read_amp_bytes_per_bit` during OPTIONS file parsing on big-endian architectures. Without this fix, the original code introduced in PR7659, when running on a big-endian machine, could mistakenly store `read_amp_bytes_per_bit` (a uint32) in little-endian format; future accesses to `read_amp_bytes_per_bit` would then give wrong values. Little-endian architectures are not affected.
- … `CompactRange` and `GetApproximateSizes`.
- `Env::GetChildren` and `Env::GetChildrenFileAttributes` will no longer return entries for the special directories `.` or `..`.
- Added `track_and_verify_wals_in_manifest`. If `true`, the log numbers and sizes of the synced WALs are tracked in the MANIFEST; then during DB recovery, if a synced WAL is missing from disk, or the WAL's size does not match the recorded size in the MANIFEST, an error is reported and the recovery is aborted. Note that this option does not work with a secondary instance.
- `rocksdb_approximate_sizes` and `rocksdb_approximate_sizes_cf` in the C API now require an error pointer (`char** errptr`) for receiving any error.
- Fixed a bug in the following combination of features: indexes with user keys (`format_version >= 3`), indexes are partitioned (`index_type == kTwoLevelIndexSearch`), and some index partitions are pinned in memory (`BlockBasedTableOptions::pin_l0_filter_and_index_blocks_in_cache`). The bug could cause keys to be truncated when read from the index, leading to wrong read results or other unexpected behavior.
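The `read_amp_bytes_per_bit` fix above is an instance of a general rule: integers persisted to files (such as the OPTIONS file) must be serialized in one fixed byte order so they parse identically on little- and big-endian hosts. A minimal fixed-little-endian encode/decode pair, modeled loosely on RocksDB's `PutFixed32`/`GetFixed32` helpers, looks like this:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Shifting extracts bytes by value, so the serialized form is the same
// regardless of the host machine's native byte order.
void PutFixed32(std::string* dst, uint32_t v) {
  dst->push_back(static_cast<char>(v & 0xff));
  dst->push_back(static_cast<char>((v >> 8) & 0xff));
  dst->push_back(static_cast<char>((v >> 16) & 0xff));
  dst->push_back(static_cast<char>((v >> 24) & 0xff));
}

// Reassembles the value from little-endian bytes, again independent of the
// host's endianness.
uint32_t GetFixed32(const std::string& src, size_t pos = 0) {
  return static_cast<uint32_t>(static_cast<unsigned char>(src[pos])) |
         (static_cast<uint32_t>(static_cast<unsigned char>(src[pos + 1])) << 8) |
         (static_cast<uint32_t>(static_cast<unsigned char>(src[pos + 2])) << 16) |
         (static_cast<uint32_t>(static_cast<unsigned char>(src[pos + 3])) << 24);
}
```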
- Fixed a bug in the following combination of features: indexes are partitioned (`index_type == kTwoLevelIndexSearch`), some index partitions are pinned in memory (`BlockBasedTableOptions::pin_l0_filter_and_index_blocks_in_cache`), and partition reads could be mixed between block cache and directly from the file (e.g., with `enable_index_compression == 1` and `mmap_read == 1`, partitions that were stored uncompressed due to poor compression ratio would be read directly from the file via mmap, while partitions that were stored compressed would be read from block cache). The bug could cause index partitions to be mistakenly considered empty during reads, leading to wrong read results.
- Fixed a `Status::Corruption` failure that occurred when `paranoid_file_checks == true` and range tombstones were written to the compaction output files.
- … (`WriteOptions.no_slowdown=true`).
- The behavior of the `ignore_unknown_options` flag (used in option parsing/loading functions) changed.
- Option parsing/loading now returns `NotFound` instead of `InvalidArgument` for option names not available in the present version.
- Fixed a bug of calling `TableBuilder::NeedCompact()` before `TableBuilder::Finish()` in the compaction job. For example, the `NeedCompact()` method of the `CompactOnDeletionCollector` returned by the built-in `CompactOnDeletionCollectorFactory` requires `BlockBasedTable::Finish()` to return the correct result. The bug could cause a compaction-generated file not to be marked for future compaction based on deletion ratio.
- Deprecated `BlockBasedTableOptions::pin_l0_filter_and_index_blocks_in_cache` and `BlockBasedTableOptions::pin_top_level_index_and_filter`. These options still take effect until users migrate to the replacement APIs in `BlockBasedTableOptions::metadata_cache_options`. Migration guidance can be found in the API comments on the deprecated options.
- Extended `DB::VerifyFileChecksums` to verify SST file checksums against the corresponding entries in the MANIFEST if present. The current implementation requires scanning and recomputing file checksums.
- The dictionary compression settings specified in `ColumnFamilyOptions::compression_opts` now additionally affect files generated by flush and compaction to non-bottommost levels. Previously those settings at most affected files generated by compaction to the bottommost level, depending on whether `ColumnFamilyOptions::bottommost_compression_opts` overrode them. Users who relied on dictionary compression settings in `ColumnFamilyOptions::compression_opts` affecting only the bottommost level can keep that behavior by moving their dictionary settings to `ColumnFamilyOptions::bottommost_compression_opts` and setting its `enabled` flag.
- When the `enabled` flag is set in `ColumnFamilyOptions::bottommost_compression_opts`, those compression options now take effect regardless of the value of `ColumnFamilyOptions::bottommost_compression`. Previously, those compression options only took effect when `ColumnFamilyOptions::bottommost_compression != kDisableCompressionOption`. Now, they additionally take effect when `ColumnFamilyOptions::bottommost_compression == kDisableCompressionOption` (such a setting causes the bottommost compression type to fall back to `ColumnFamilyOptions::compression_per_level` if configured, and otherwise to fall back to `ColumnFamilyOptions::compression`).
- Fixed the case where `CompactRange()` with `CompactRangeOptions::change_level` set fails due to a conflict in the level-change step, which caused all subsequent calls to `CompactRange()` with `CompactRangeOptions::change_level` set to incorrectly fail with a `Status::NotSupported("another thread is refitting")` error.
- … when `BottommostLevelCompaction.kForce` or `kForceOptimized` is set.
- Fixed an issue that occurs when `CompactRange()` for refitting levels (`CompactRangeOptions::change_level == true`) and another manual compaction are executed in parallel.
- Sanitize `recycle_log_file_num` to zero when the user attempts to enable it in combination with `WALRecoveryMode::kTolerateCorruptedTailRecords`. Previously the two features were allowed together, which compromised the user's configured crash-recovery guarantees.
- Dictionary compression now works with `SstFileWriter`. Previously, the dictionary would be trained/finalized immediately with zero samples. Now, the whole `SstFileWriter` file is buffered in memory and then sampled.
- Fixed a bug that occurred with `avoid_unnecessary_blocking_io=1` when creating backups (`BackupEngine::CreateNewBackup`) or checkpoints (`Checkpoint::Create`). With this setting and WAL enabled, these operations could randomly fail with a non-OK status.
- A new member, `std::string requested_checksum_func_name`, is added to `FileChecksumGenContext`, which enables the checksum factory to create generators for a suite of different functions.
- Added `ldb unsafe_remove_sst_file`, which removes a lost or corrupt SST file from a DB's metadata. This command involves data loss and must not be used on a live DB.
- … the `kCompactionStyleLevel` compaction style with `level_compaction_dynamic_level_bytes` set.
- … `share_files_with_checksum` is used with `kLegacyCrc32cAndFileSize` naming (discouraged).
- For `share_files_with_checksum`, we are confident there is no regression (vs. pre-6.12) in detecting DB or backup corruption at backup creation time, mostly because the old design did not leverage this extra checksum computation for detecting inconsistencies at backup creation time.
- For `share_table_files` without "checksum" (not recommended), there is a regression in detecting fundamentally unsafe use of the option, greatly mitigated by file size checking (under "Behavior Changes"). Almost no reason to use `share_files_with_checksum=false` should remain.
- `DB::VerifyChecksum` and `BackupEngine::VerifyBackup` with checksum checking are still able to catch corruptions that `CreateNewBackup` does not.
- Added a note to the `DB::DeleteFile()` API describing its known problems and deprecation plan.
- The default return status of `FSRandomAccessFile.Prefetch()` is changed from `OK` to `NotSupported`. If the user's inherited file class doesn't implement prefetch, RocksDB will create an internal prefetch buffer to improve read performance.
- `EventListener` in listener.h contains new callback functions: `OnFileFlushFinish()`, `OnFileSyncFinish()`, `OnFileRangeSyncFinish()`, `OnFileTruncateFinish()`, and `OnFileCloseFinish()`.
- `FileOperationInfo` now reports `duration`, measured by `std::chrono::steady_clock`, and `start_ts`, measured by `std::chrono::system_clock`, instead of start and finish timestamps measured by `system_clock`. Note that `system_clock` is called before `steady_clock` in program order at operation start.
- `DB::GetDbSessionId(std::string& session_id)` is added. `session_id` stores a unique identifier that gets reset every time the DB is opened. This DB session ID should be unique among all open DB instances on all hosts, and should be unique among re-openings of the same or other DBs. This identifier is recorded in the LOG file on the line starting with "DB Session ID:".
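The `FileOperationInfo` change above reflects a standard timing pattern: wall clocks (`system_clock`) can jump backward or forward (NTP, DST), so they are only suitable for timestamps, while durations should come from the monotonic `steady_clock`. A self-contained sketch of that pairing, with illustrative names:

```cpp
#include <cassert>
#include <chrono>

// Pair a wall-clock start timestamp (for logging/correlation) with a
// monotonic start point (for measuring duration).
struct OpTiming {
  std::chrono::system_clock::time_point start_ts;
  std::chrono::steady_clock::time_point start;
};

OpTiming BeginOp() {
  OpTiming t;
  // Mirror the ordering noted above: system_clock is sampled first.
  t.start_ts = std::chrono::system_clock::now();
  t.start = std::chrono::steady_clock::now();
  return t;
}

// Duration against the monotonic clock; never negative even if the wall
// clock is adjusted mid-operation.
std::chrono::nanoseconds ElapsedSince(const OpTiming& t) {
  return std::chrono::steady_clock::now() - t.start;
}
```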
- `DB::OpenForReadOnly()` now returns `Status::NotFound` when the specified DB directory does not exist. Previously the error returned depended on the underlying `Env`. This change is available in all 6.11 releases as well.
- A parameter, `verify_with_checksum`, is added to `BackupEngine::VerifyBackup`; it is false by default. If it is true, `BackupEngine::VerifyBackup` verifies checksums and file sizes of backup files. Pass `false` for `verify_with_checksum` to maintain the previous behavior and performance of `BackupEngine::VerifyBackup`, by only verifying the sizes of backup files.
- When `file_checksum_gen_factory` is set to `GetFileChecksumGenCrc32cFactory()`, BackupEngine will compare the crc32c checksums of table files computed when creating a backup to the expected checksums stored in the DB manifest, and will fail `CreateNewBackup()` on mismatch (corruption). If `file_checksum_gen_factory` is not set, or is set to any other customized factory, there is no checksum verification to detect whether SST files in a DB are corrupt when read, copied, and independently checksummed by BackupEngine.
- When `stats_dump_period_sec > 0`, either as the initial value for DB open or as a dynamic option change, the first stats dump is staggered in the following X seconds, where X is an integer in `[0, stats_dump_period_sec)`. Subsequent stats dumps are still spaced `stats_dump_period_sec` seconds apart.
- DB identity (`db_id`) and DB session identity (`db_session_id`) are added to table properties and stored in SST files. SST files generated by SstFileWriter and Repairer have the DB identity "SST Writer" and "DB Repairer", respectively. Their DB session IDs are generated in the same way as `DB::GetDbSessionId`. The session ID for SstFileWriter (resp., Repairer) resets every time `SstFileWriter::Open` (resp., `Repairer::Run`) is called.
- A new option, `BackupableDBOptions::share_files_with_checksum_naming`, is added with new default behavior for naming backup files with `share_files_with_checksum`, to address performance and backup integrity issues. See the API comments for details.
- `max_subcompactions` can be set dynamically using `DB::SetDBOptions()`.
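The staggered stats dump described above is a jitter technique: the first dump fires after a random offset in `[0, period)` so that many DB instances opened together do not all dump in lockstep, while later dumps keep the full period. A self-contained sketch with illustrative names:

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Random delay for the first dump, uniform in [0, period_sec).
uint64_t FirstDumpDelaySec(uint64_t period_sec, std::mt19937_64& rng) {
  std::uniform_int_distribution<uint64_t> dist(0, period_sec - 1);
  return dist(rng);
}

// Every subsequent dump is spaced by the full period.
uint64_t NextDumpDelaySec(uint64_t period_sec) { return period_sec; }
```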
- … the `shared_checksum` directory when using `share_files_with_checksum_naming = kUseDbSessionId` (new default), except on SST files generated before this version of RocksDB, which fall back on using `kLegacyCrc32cAndFileSize`.
- … now returns `Status::InvalidArgument` if the range's end key comes before its start key according to the user comparator. Previously the behavior was undefined.
- … `force_consistency_checks` is `false`.
- `pin_l0_filter_and_index_blocks_in_cache` no longer applies to L0 files larger than `1.5 * write_buffer_size`, to give more predictable memory usage. Such L0 files may exist due to intra-L0 compaction, external file ingestion, or the user dynamically changing `write_buffer_size` (note, however, that files that are already pinned will continue being pinned, even after such a dynamic change).
- `Env::LowerThreadPoolCPUPriority(Priority, CpuPriority)` is added to `Env`, to be able to lower a pool to a specific priority such as `CpuPriority::kIdle`.
- Added a parallel-compression optimization in `BlockBasedTableBuilder`. This optimization makes block building, block compression, and block appending a pipeline, and uses multiple threads to accelerate block compression. Users can set `CompressionOptions::parallel_threads` greater than 1 to enable compression parallelism. This feature is experimental for now.
- `max_background_flushes` can be set dynamically using `DB::SetDBOptions()`.
- Added `--compression_level_from` and `--compression_level_to` to report the size of all compression levels; one `compression_type` must be specified with them so that compressed sizes of one compression type with different levels are reported.
- Reduced `PerfContext::user_key_comparison_count` for lookups in files written with `format_version >= 3`.
- The misspelled enum value is corrected to `COMMITTED`, while the old misspelled `COMMITED` is still available as an alias.
- A new parameter, `CreateBackupOptions`, is added to both `BackupEngine::CreateNewBackup` and `BackupEngine::CreateNewBackupWithMetadata`; you can decrease the CPU priority of `BackupEngine`'s background threads by setting `decrease_background_thread_cpu_priority` and `background_thread_cpu_priority` in `CreateBackupOptions`.
- `WriteBatchWithIndex::DeleteRange` returns `Status::NotSupported`. Previously it returned success even though reads on the batch did not account for range tombstones. The corresponding language bindings now cannot be used. In C, that includes `rocksdb_writebatch_wi_delete_range`, `rocksdb_writebatch_wi_delete_range_cf`, `rocksdb_writebatch_wi_delete_rangev`, and `rocksdb_writebatch_wi_delete_rangev_cf`. In Java, that includes `WriteBatchWithIndex::deleteRange`.
- Added statistics for BlobDB garbage collection: `BLOB_DB_GC_NUM_FILES` (number of blob files obsoleted during GC), `BLOB_DB_GC_NUM_NEW_FILES` (number of new blob files generated during GC), `BLOB_DB_GC_FAILURES` (number of failed GC passes), `BLOB_DB_GC_NUM_KEYS_RELOCATED` (number of blobs relocated during GC), and `BLOB_DB_GC_BYTES_RELOCATED` (total size of blobs relocated during GC). On the other hand, the following statistics, which are not relevant for the new GC implementation, are now deprecated: `BLOB_DB_GC_NUM_KEYS_OVERWRITTEN`, `BLOB_DB_GC_NUM_KEYS_EXPIRED`, `BLOB_DB_GC_BYTES_OVERWRITTEN`, `BLOB_DB_GC_BYTES_EXPIRED`, and `BLOB_DB_GC_MICROS`.
- `db_bench` now supports the `value_size_distribution_type`, `value_size_min`, and `value_size_max` options for generating random variable-sized values. Added the `blob_db_compression_type` option for BlobDB to enable blob compression.
- Added `OptimisticTransactionDBOptions`, an option that allows users to configure the OCC validation policy. The default policy changes from `kValidateSerial` to `kValidateParallel` to reduce mutex contention.
- `max_background_jobs` can be changed dynamically through the `SetDBOptions` interface.
- BlobDB now performs garbage collection when `enable_garbage_collection` is set to `true` in `BlobDBOptions`. Garbage collection is performed during compaction: any valid blobs located in the oldest N files (where N is the number of non-TTL blob files multiplied by the value of `BlobDBOptions::garbage_collection_cutoff`) encountered during compaction get relocated to new blob files, and old blob files are dropped once they are no longer needed. Note: we recommend enabling periodic compactions for the base DB when using this feature, to deal with the case when some old blob files are kept alive by SSTs that otherwise do not get picked for compaction.
- `db_bench` now supports the `garbage_collection_cutoff` option for BlobDB.
- … the `creation_time` of new compaction outputs.
- RocksDB now compares the `ColumnFamilyHandle` pointers themselves, instead of only the column family IDs, when checking whether an API call uses the default column family or not.
- `GetLiveFilesMetaData` and `GetColumnFamilyMetaData` now expose the file number of SST files, as well as the oldest blob file referenced by each SST.
- The `sst_dump` command-line tool's `recompress` command now displays how many blocks were compressed and how many were not; in particular, how many were not compressed because the compression ratio was not met (12.5% threshold for GoodCompressionRatio), as seen in the `number.block.not_compressed` counter stat since version 6.0.0.
- `db_bench` now supports, and by default issues, non-TTL Puts to BlobDB. TTL Puts can be enabled by explicitly specifying a non-zero value for the `blob_db_max_ttl_range` command-line parameter.
- `sst_dump` now supports printing BlobDB blob indexes in a human-readable format. This can be enabled by specifying the `decode_blob_index` flag on the command line.
- The `creation_time` table property for compaction output files is now set to the minimum of the creation times of all compaction inputs.
- See the example `LevelAndStyleCustomFilterPolicy` in db_bloom_filter_test.cc. While most existing custom implementations of FilterPolicy should continue to work as before, those wrapping the return of NewBloomFilterPolicy will require overriding the new function `GetBuilderWithContext()`, because calling `GetFilterBitsBuilder()` on the FilterPolicy returned by NewBloomFilterPolicy is no longer supported.
- Removed the `snap_refresh_nanos` option.
- Changed the default to `UINT64_MAX - 1`, which allows RocksDB to auto-tune periodic compaction scheduling. When using the default value, periodic compactions are now auto-enabled if a compaction filter is used. A value of `0` will turn off the feature completely.
- Changed the default to `UINT64_MAX - 1`, which allows RocksDB to auto-tune the TTL value. When using the default value, TTL will be auto-enabled to 30 days when the feature is supported. To revert to the old behavior, you can explicitly set it to 0.
- … `snap_refresh_nanos` is set to 0.
- Added `memtable_insert_hint_per_batch` to WriteOptions. If it is true, each WriteBatch will maintain its own insert hints for each memtable in concurrent writes. See include/rocksdb/options.h for more details.
- Added `--secondary_path` to ldb to open the database as a secondary instance. This keeps the original DB intact.
- Added the `RegisterCustomObjects` function. By linking the unit test binary with the static library, the unit test can execute this function.
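The BlobDB garbage collection entry above selects the oldest N non-TTL blob files, where N is the file count multiplied by `garbage_collection_cutoff`. The arithmetic is simple, but the clamping and truncation matter; here is a self-contained sketch (names illustrative):

```cpp
#include <cassert>
#include <cstdint>

// Number of oldest non-TTL blob files whose valid blobs get relocated.
// cutoff is expected in [0.0, 1.0]; out-of-range values are clamped.
uint64_t FilesToCollect(uint64_t num_non_ttl_blob_files, double cutoff) {
  if (cutoff <= 0.0) return 0;
  if (cutoff >= 1.0) return num_non_ttl_blob_files;
  // Truncate toward zero: only whole files are selected for collection.
  return static_cast<uint64_t>(cutoff * num_non_ttl_blob_files);
}
```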
- Added `snap_refresh_nanos` (default 0) to periodically refresh the snapshot list in compaction jobs. Assign 0 to disable the feature.
- Added the option `unordered_write`, which trades snapshot guarantees for higher write throughput. When used with WRITE_PREPARED transactions with `two_write_queues=true`, it offers higher throughput with, however, no compromise on guarantees.
- Added `failed_move_fall_back_to_copy` (default is true) for external SST ingestion. When `move_files` is true and the hard link fails, ingestion falls back to copy if `failed_move_fall_back_to_copy` is true. Otherwise, ingestion reports an error.
- Added the command `list_file_range_deletes` to ldb, which prints out tombstones in SST files.
- Fixed merge processing logic that allowed `Put`s covered by range tombstones to reappear. Note that `Put`s may exist even if the user only ever called `Merge()`, due to an internal conversion during compaction to the bottommost level.
- Added an option, `strict_bytes_per_sync`, that causes a file-writing thread to block rather than exceed the limit on bytes pending writeback specified by `bytes_per_sync` or `wal_bytes_per_sync`.
- Fixed an assertion failure `IsFlushPending() == true` caused by one background thread releasing the DB mutex in ~ColumnFamilyData and another thread clearing the `flush_requested_` flag.
- When `cache_index_and_filter_blocks == true`, we now store dictionary data used for decompression in the block cache for better control over memory usage. For users of ZSTD v1.1.4+ who compile with -DZSTD_STATIC_LINKING_ONLY, this includes a digested dictionary, which is used to increase decompression speed.
- … use the `GetStatsHistory` API to retrieve these snapshots.
- `SstFileWriter` will now use dictionary compression if it is configured in the file writer's `CompressionOptions`.
- `TableProperties::num_entries` and `TableProperties::num_deletions` now also account for the number of range tombstones.
- `number.block.not_compressed` now also counts blocks not compressed due to poor compression ratio.
- Removed `ttl` from `CompactionOptionsFIFO`. The option has been deprecated, and `ttl` in `ColumnFamilyOptions` is used instead.
- Fixed an incorrect `NotFound` point lookup result when querying the endpoint of a file that has been extended by a range tombstone.
- Added a `JemallocNodumpAllocator` memory allocator. When in use, the block cache will be excluded from core dumps.
- Added `PerfContextByLevel` as part of `PerfContext`, which allows storing perf context at each level. Also replaced `__thread` with the `thread_local` keyword for perf_context. Added per-level perf context for bloom filter and `Get` queries.
- Added a new DB option, `atomic_flush`. If true, RocksDB supports flushing multiple column families and atomically committing the result to the MANIFEST. Useful when WAL is disabled.
- Added the `num_deletions` and `num_merge_operands` members to `TableProperties`.
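The `PerfContextByLevel` entry above combines two ideas: per-level counters, and per-thread storage via the standard `thread_local` keyword (replacing the GNU `__thread` extension), which gives each thread its own counters without locking. A self-contained sketch with illustrative names and a made-up level cap:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kMaxLevels = 7;  // illustrative cap, not a RocksDB constant

struct LevelPerf {
  uint64_t bloom_filter_useful = 0;
  uint64_t get_from_table_count = 0;
};

// Each thread gets its own array; no synchronization needed on update.
thread_local std::array<LevelPerf, kMaxLevels> level_perf{};

void RecordGet(int level, bool bloom_rejected) {
  auto& p = level_perf[level];
  p.get_from_table_count++;
  if (bloom_rejected) p.bloom_filter_useful++;  // filter saved a table read
}
```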
- Added `MemoryAllocator`, which lets the user specify a custom memory allocator for block-based tables.
- Improved `DeleteRange` to prevent read performance degradation. The feature is no longer marked as experimental.
- `DBOptions::use_direct_reads` now affects reads issued by `BackupEngine` on the database's SSTs.
- `NO_ITERATORS` is divided into two counters, `NO_ITERATOR_CREATED` and `NO_ITERATOR_DELETE`. Both of them are only increasing now, just as other counters are.
- Fixed the `NO_FILE_CLOSES` ticker statistic, which was always zero previously.
- `OnTableFileCreated` will now be called for empty files generated during compaction. In that case, `TableFileCreationInfo::file_path` will be "(nil)" and `TableFileCreationInfo::file_size` will be zero.
- Added `FlushOptions::allow_write_stall`, which controls whether Flush calls start working immediately, even if that causes user writes to stall, or wait until flush can be performed without causing a write stall (similar to `CompactRangeOptions::allow_write_stall`). Note that the default value is false, meaning we add delay to Flush calls until stalling can be avoided when possible. This is a behavior change compared to previous RocksDB versions, where Flush calls didn't check whether they might cause a stall or not.
- … `OnCompactionCompleted`.
- Changed `CompactFiles` run with `CompactionOptions::compression == CompressionType::kDisableCompressionOption`. Now that setting causes the compression type to be chosen according to the column-family-wide compression options.
- Merge operands are now passed to `MergeOperator::ShouldMerge` in the reversed order relative to how they were merged (passed to FullMerge or FullMergeV2), for performance reasons.
- … `max_num_ikeys`.
- Using the dictionary trainer (i.e., setting `CompressionOptions::zstd_max_train_bytes` to a nonzero value) now requires ZSTD version 1.1.3 or later.
- Added `bottommost_compression_opts`. To keep backward compatibility, a new boolean, `enabled`, is added to CompressionOptions. For `compression_opts`, it will always be used no matter what the value of `enabled` is. For `bottommost_compression_opts`, it will only be used when the user sets `enabled=true`; otherwise, `compression_opts` will be used for bottommost_compression as the default.
- For `Statistics` objects created via `CreateDBStatistics()`, the format of the string returned by its `ToString()` method has changed.
- `ColumnFamilyOptions::ttl` can now be changed via `SetOptions()`.
- Changed the default of `bytes_max_delete_chunk` to 0 in NewSstFileManager(), as it doesn't work well with checkpoints.
- `DBOptions::use_direct_io_for_flush_and_compaction` only applies to background writes, and `DBOptions::use_direct_reads` applies to both user reads and background reads. This conforms with Linux's `open(2)` manpage, which advises against simultaneously reading a file in buffered and direct modes, due to possibly undefined behavior and degraded performance.
- Added `CompressionOptions::kDefaultCompressionLevel`, which is a generic way to tell RocksDB to use the compression library's default level. It is now the default value for `CompressionOptions::level`. Previously the level defaulted to -1, which gave poor compression ratios in ZSTD.
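The `enabled` flag described in the bottommost-compression entries above boils down to a small resolution rule: `compression_opts` applies everywhere by default, and `bottommost_compression_opts` wins only for bottommost output and only when its flag is set. A self-contained sketch of that rule (types and names are illustrative, not RocksDB's):

```cpp
#include <cassert>

// Simplified stand-in for a compression options struct.
struct CompOpts {
  int level = 0;
  bool enabled = false;  // only meaningful for the bottommost variant
};

// Pick the options for one compaction output file.
CompOpts ResolveOutputOpts(const CompOpts& compression_opts,
                           const CompOpts& bottommost_opts,
                           bool is_bottommost_level) {
  if (is_bottommost_level && bottommost_opts.enabled) {
    return bottommost_opts;  // honored only when explicitly enabled
  }
  return compression_opts;  // default for every other case
}
```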
- Added the `Env::LowerThreadPoolCPUPriority(Priority)` method, which lowers the CPU priority of background (esp. compaction) threads to minimize interference with foreground tasks.
- … `Env::SetBackgroundThreads()`, compactions to the bottom level will be delegated to that thread pool.
- `prefix_extractor` has been moved from ImmutableCFOptions to MutableCFOptions, meaning it can be dynamically changed without a DB restart.
- Fixed `BackupableDBOptions::max_valid_backups_to_open` to not delete backup files when the refcount cannot be accurately determined.
- Added `BlockBasedTableConfig.setBlockCache` to allow sharing a block cache across DB instances.
- The `ignore_unknown_options` argument will only be effective if the option file shows that it was generated using a higher version of RocksDB than the current version.
- Avoid unnecessary flushes in `CompactRange()` when the range specified by the user does not overlap unflushed memtables.
- If `ColumnFamilyOptions::max_subcompactions` is set greater than one, we now parallelize large manual level-based compactions.
- Added an `include_end` option to make the range end exclusive when `include_end == false` in `DeleteFilesInRange()`.
- Added `CompactRangeOptions::allow_write_stall`, which makes `CompactRange` start working immediately, even if it causes user writes to stall. The default value is false, meaning we add delay to `CompactRange` calls until stalling can be avoided when possible. Note this delay is not present in previous RocksDB versions.
- … now returns `Status::InvalidArgument`; previously, it returned `Status::IOError`.
- Added `DeleteFilesInRanges()` to delete files in multiple ranges at once for better performance.
- Fixed `DisableFileDeletions()` followed by `GetSortedWalFiles()` to not return obsolete WAL files that `PurgeObsoleteFiles()` is going to delete.
- Added `autoTune` and `getBytesPerSecond()` to the RocksJava RateLimiter.
- `make` with the environment variable `USE_SSE` set and `PORTABLE` unset will use all machine features available locally. Previously this combination only compiled SSE-related features.
- Added a new ticker stat, `NUMBER_ITER_SKIP`, which returns how many internal keys were skipped during iterations (e.g., due to being tombstones or duplicate versions of a key).
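A ticker like `NUMBER_ITER_SKIP` counts work the iterator did that never becomes visible to the user: internal entries such as tombstones are stepped over rather than returned, and each stepped-over entry increments the counter. A self-contained sketch of that bookkeeping (the entry model here is illustrative, not RocksDB's internal key format):

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct InternalEntry {
  std::string key;
  bool is_tombstone;
};

// Returns visible keys; `skipped` receives the number of entries stepped
// over (hidden from the user, but work was still done to pass them).
std::vector<std::string> Scan(const std::vector<InternalEntry>& entries,
                              uint64_t* skipped) {
  std::vector<std::string> out;
  for (const auto& e : entries) {
    if (e.is_tombstone) {
      ++*skipped;
      continue;
    }
    out.push_back(e.key);
  }
  return out;
}
```

A scan over many tombstones can therefore be slow while returning few keys, which is exactly what this counter makes visible.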
- Added the transaction stats `key_lock_wait_count` and `key_lock_wait_time`, which measure the number of times transactions wait on key locks and the total amount of time waiting.
- Fixed a performance problem in `IngestExternalFile()` affecting databases with a large number of SST files.
- Fixed an issue that occurs when `DeleteFilesInRange()` deletes a subset of the files spanned by a `DeleteRange()` marker.
- `BackupableDBOptions::max_valid_backups_to_open == 0` now means no backups will be opened during BackupEngine initialization. Previously this condition disabled limiting the backups opened.
- `DBOptions::preserve_deletes` is a new option that allows one to specify that the DB should not drop tombstones for regular deletes if they have a sequence number larger than what was set by the new API call `DB::SetPreserveDeletesSequenceNumber(SequenceNumber seqnum)`. Disabled by default.
- `DB::SetPreserveDeletesSequenceNumber(SequenceNumber seqnum)` was added; users who wish to preserve deletes are expected to periodically call this function to advance the cutoff seqnum (all deletes made before this seqnum can be dropped by the DB). It is the user's responsibility to figure out how to advance the seqnum so that tombstones are kept for the desired period of time, yet are eventually processed in time and don't eat up too much space.
- `ReadOptions::iter_start_seqnum` was added; if set to something > 0, users will see two changes in iterator behavior: 1) only keys written with a sequence number larger than this parameter would be returned, and 2) the `Slice` returned by iter->key() now points to memory that keeps a user-oriented representation of the internal key, rather than the user key. A new struct, `FullKey`, was added to represent internal keys, along with a new helper function, `ParseFullKey(const Slice& internal_key, FullKey* result);`.
- Added `crc32c_3way` on supported platforms to improve performance. The system will choose to use this algorithm on supported platforms automatically whenever possible. If PCLMULQDQ is not supported, it will fall back to the old Fast_CRC32 algorithm.
- `DBOptions::writable_file_max_buffer_size` can now be changed dynamically.
- `DBOptions::bytes_per_sync`, `DBOptions::compaction_readahead_size`, and `DBOptions::wal_bytes_per_sync` can now be changed dynamically; changing `DBOptions::wal_bytes_per_sync` will flush all memtables and switch to a new WAL file.
- The rate limiter can now be auto-tuned by passing `true` to the `auto_tuned` parameter in `NewGenericRateLimiter()`. The value passed as `rate_bytes_per_sec` will still be respected as an upper bound.
- `ColumnFamilyOptions::compaction_options_fifo` can now be changed dynamically.
- Introduced the `EventListener::OnStallConditionsChanged()` callback. Users can implement it to be notified when user writes are stalled, stopped, or resumed.
- Added `ReadOptions::iterate_lower_bound`.
- `DB::Open()` will abort if a column family inconsistency is found during PIT recovery.
- … `DeleteRange()`.
- Users of `Statistics::getHistogramString()` will see fewer histogram buckets and different bucket endpoints.
- `Slice::compare` and the BytewiseComparator `Compare` no longer accept `Slice`s containing nullptr.
- `Transaction::Get` and `Transaction::GetForUpdate` variants with `PinnableSlice` added.
- … `Env::SetBackgroundThreads(N, Env::Priority::BOTTOM)`, where `N > 0`.
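The `Slice::compare` change above turns nullptr-backed slices into a precondition violation rather than accepted input; empty slices should wrap a valid sentinel such as `""`. A minimal sketch of such a contract on a stand-in type (not RocksDB's `Slice`):

```cpp
#include <cassert>
#include <cstring>

struct ByteSpan {
  const char* data;
  size_t size;
};

// Three-way compare. Precondition: both spans point at real memory; a
// nullptr-backed span is a caller bug, asserted rather than tolerated.
int Compare(const ByteSpan& a, const ByteSpan& b) {
  assert(a.data != nullptr && b.data != nullptr);
  const size_t min_len = a.size < b.size ? a.size : b.size;
  int r = std::memcmp(a.data, b.data, min_len);
  if (r == 0) {
    if (a.size < b.size) r = -1;
    else if (a.size > b.size) r = 1;
  }
  return r;
}
```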
- Added `MergeOperator::AllowSingleOperand`.
- Added `DB::VerifyChecksum()`, which verifies the checksums in all SST files of a running DB.
- … `BlockBasedTableOptions::checksum = kNoChecksum`.
- Added the histograms `rocksdb.db.get.micros`, `rocksdb.db.write.micros`, and `rocksdb.sst.read.micros`.
- Introduced the `EventListener::OnBackgroundError()` callback. Users can implement it to be notified of errors causing the DB to enter read-only mode, and optionally override them.
- … `DeleteRange()` is used together with subcompactions.
- … `max_background_flushes=0`. Instead, users can achieve this by configuring their high-pri thread pool to have zero threads.
- Replaced `Options::max_background_flushes`, `Options::max_background_compactions`, and `Options::base_background_compactions` with `Options::max_background_jobs`, which automatically decides how many threads to allocate towards flush/compaction.
- Replaced the global variable `IOStatsContext iostats_context` with `IOStatsContext* get_iostats_context()`; replaced the global variable `PerfContext perf_context` with `PerfContext* get_perf_context()`.
- `DB::IngestExternalFile()` now supports ingesting files into a database containing range deletions.
- The `max_open_files` option can be changed via SetDBOptions().
- Added `GetAllKeyVersions` to see internal versions of a range of keys.
- … `allow_ingest_behind`.
- The `stats_dump_period_sec` option can be changed via SetDBOptions().
- The `delete_obsolete_files_period_micros` option can be changed via SetDBOptions().
- The `delayed_write_rate` and `max_total_wal_size` options can be changed via SetDBOptions().
- The `delayed_write_rate` option can be changed via SetDBOptions().
- … `const WriteEntry&`.
- … `make rocksdbjavastatic`.
- Renamed `StackableDB::GetRawDB()` to `StackableDB::GetBaseDB()`.
- `WriteBatch::Data()` is now `const std::string& Data() const`.
- Renamed `TableStats` to `TableProperties`.
- Removed `PrefixHashRepFactory`. Please use `NewHashSkipListRepFactory()` instead.
- … `EnableFileDeletions()` and `DisableFileDeletions()`.
- Added `DB::GetOptions()`.
- Added `DB::GetDbIdentity()`.
- `SliceParts` - a variant of `Put()` that gathers output like `writev(2)`.
- `Get()` -- 1fdb3f -- 1.5x QPS increase for some workloads.
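The `SliceParts` entry above describes gathering discontiguous pieces into one write, in the style of `writev(2)`. This POSIX sketch writes two fragments through a pipe with a single `writev` call and reads back the joined bytes (Linux/POSIX only; names are illustrative):

```cpp
#include <cassert>
#include <cstring>
#include <sys/uio.h>
#include <unistd.h>

// Gather two separate buffers into one write syscall, then verify the
// reader sees them as a single contiguous byte sequence.
bool GatherWriteDemo() {
  int fds[2];
  if (pipe(fds) != 0) return false;
  struct iovec parts[2];
  parts[0].iov_base = const_cast<char*>("key=");
  parts[0].iov_len = 4;
  parts[1].iov_base = const_cast<char*>("value");
  parts[1].iov_len = 5;
  ssize_t n = writev(fds[1], parts, 2);  // one syscall, two buffers
  char buf[16] = {0};
  ssize_t r = read(fds[0], buf, sizeof(buf));
  close(fds[0]);
  close(fds[1]);
  return n == 9 && r == 9 && std::memcmp(buf, "key=value", 9) == 0;
}
```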