HBase coprocessor missing


This article describes various Apache HBase performance tuning guidelines and tips for getting optimal performance on Azure HDInsight. Many of these tips depend on the particular workload and read/write/scan pattern. Before you apply configuration changes in a production environment, test them thoroughly.

The top bottleneck in most HBase workloads is the Write Ahead Log (WAL). HDInsight HBase has a separated storage-compute model: data is stored remotely on Azure Storage, even though virtual machines host the region servers. Until recently, the WAL was also written to Azure Storage, which amplified this bottleneck in HDInsight. The Accelerated Writes feature is designed to solve this problem. It writes the WAL to Azure Premium SSD-managed disks. This greatly improves write performance and helps with many of the issues that write-intensive workloads face.

To gain significant improvement in read operations, use Premium Block Blob Storage Account as your remote storage. This option is possible only if the WAL feature is enabled.

Compaction

Compaction is another potential bottleneck that is widely recognized in the community. By default, major compaction is disabled on HDInsight HBase clusters. Because compaction is a resource-intensive process, it is left disabled so that customers have full flexibility to schedule it according to their workloads. For example, they might schedule it during off-peak hours. Also, data locality isn't a concern because the storage is remote (backed by Azure Storage) rather than a local Hadoop Distributed File System (HDFS).

Customers should schedule major compaction at their convenience. If they skip this maintenance, the lack of compaction will adversely affect read performance in the long run.

For scan operations, mean latencies that are much higher than 100 milliseconds should be a cause for concern. Check whether major compaction has been scheduled correctly.

Apache Phoenix workload

Answering the following questions will help you understand your Apache Phoenix workload better:

Are all your "reads" translating to scans?

If so, what are the characteristics of these scans?

Have you optimized your Phoenix table schema for these scans, including appropriate indexing?

Have you used the EXPLAIN statement to understand the query plans your "reads" generate?

Are your writes "upsert-selects"?

If so, they would also be doing scans. Expected latency for scans averages approximately 100 milliseconds, compared to 10 milliseconds for point gets in HBase.

If you're using benchmarks such as Yahoo! Cloud Serving Benchmark, JMeter, or Pherf to test and tune performance, make sure that:

The client machines don't become a bottleneck. To do this, check the CPU usage on client machines.

Client-side configurations, like the number of threads, are tuned appropriately to saturate client bandwidth.

Test results are recorded accurately and systematically.

If your queries suddenly started doing much worse than before, check for potential bugs in your application code.
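
As a rough illustration of the 100-millisecond scan guideline mentioned in this article, the sketch below flags a set of recorded scan latencies whose mean sits well above that figure. The function name, the sample values, and the multiplier used to interpret "much higher than" are all assumptions made for this example; the article does not define a specific threshold.

```python
from statistics import mean

# Guideline figure quoted in the article: scans average about 100 ms.
EXPECTED_SCAN_MS = 100.0

# Hypothetical multiplier standing in for "much higher than";
# the article gives no number, so tune this for your own workload.
CONCERN_FACTOR = 2.0


def scan_latency_is_concerning(latencies_ms, factor=CONCERN_FACTOR):
    """Return True when the mean scan latency exceeds factor * 100 ms."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    return mean(latencies_ms) > EXPECTED_SCAN_MS * factor


if __name__ == "__main__":
    # Made-up measurements (milliseconds) from a benchmark run.
    samples = [90.0, 110.0, 120.0]
    if scan_latency_is_concerning(samples):
        print("Mean scan latency is high; check the major compaction schedule.")
    else:
        print("Mean scan latency is within the assumed tolerance.")
```

A check like this is only useful if benchmark results are recorded consistently, so that the same threshold is applied to comparable runs.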