


Can I Shutdown Cassandra Node During Repair

The DC/OS Apache Cassandra service provides a robust API (accessible via HTTP or the DC/OS CLI) for managing, repairing, and monitoring the service. Here, only the CLI version is presented for conciseness; see the API Reference for HTTP instructions.

Program Operations

An instance of the DC/OS Apache Cassandra service is ultimately a set of plans (how to run the service) and a set of pods (what to run). The service is driven primarily by two plans: deploy and recovery.

The deploy plan is used during initial installation, as well as during configuration and service updates. The recovery plan is an always-running plan that ensures that the pods and tasks that should be running stay running. Other plans ("sidecars") are used for secondary operations.

Every plan is made up of a set of phases, each of which consists of one or more steps.

Listing plans

A list of plans can be retrieved using the CLI command:

          dcos cassandra --name=cassandra plan list                  

Inspecting a plan

View the current status of a plan using the CLI command:

          dcos cassandra --name=cassandra plan status <plan-name>

For example, the status of the completed deploy plan of the service will be:

          dcos cassandra --name=cassandra plan status deploy
          deploy (serial strategy) (COMPLETE)
          └─ node-deploy (serial strategy) (COMPLETE)
             ├─ node-0:[server] (COMPLETE)
             ├─ node-0:[init_system_keyspaces] (COMPLETE)
             ├─ node-1:[server] (COMPLETE)
             └─ node-2:[server] (COMPLETE)

Note: Passing the --json flag to the status command will return a JSON response which has UUIDs for the phases and steps of a plan. These UUIDs can be useful in the other plan commands.
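As a sketch of how those UUIDs might be pulled out of the JSON response for use in later plan commands, consider the snippet below. The JSON shape shown is a simplified assumption for illustration, not the exact response schema:

```shell
# Hypothetical, abbreviated --json response; the real schema may differ.
json='{"phases":[{"id":"aaaa-1111","name":"node-deploy","steps":[{"id":"bbbb-2222","name":"node-0:[server]"}]}]}'

# Extract every "id" value with grep/sed, avoiding a jq dependency.
printf '%s\n' "$json" \
  | grep -o '"id":"[^"]*"' \
  | sed 's/"id":"\(.*\)"/\1/'
```

With jq available, `jq -r '.phases[].id'` over the real response would be the more robust approach.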

Operating on a plan

Start

Start a plan with the CLI command:

          dcos cassandra --name=cassandra plan start <plan-name>

Pause

Pauses the plan, or a specific phase in that plan with the provided phase name (or UUID).

          dcos cassandra --name=cassandra plan pause <plan (required)> <phase (optional)>

Stop

Stops the running plan with the provided name.

          dcos cassandra --name=cassandra plan stop <plan>

Plan Stop differs from Plan Pause in the following ways:

  • Pause can be issued for a specific phase or for all phases within a plan. Stop can only be issued for a plan.
  • Pause updates the underlying Phase/Step state. Stop both ceases execution of the plan and resets the plan to its initial pending state.

Resume

Resumes the plan, or a specific phase in that plan, with the provided phase name (or UUID).

          dcos cassandra --name=cassandra plan resume <plan (required)> <phase (optional)>

Force-Restart

Restarts the specified step, the phase if no step is specified, or the plan if no phase is specified.

          dcos cassandra --name=cassandra plan force-restart <plan (required)> <phase (optional)> <step (optional)>

Force-Complete

Force completes a specific step in the provided phase of the plan. From the CLI it is only possible to force complete a step. The HTTP API supports force completing phases and plans in their entirety.

          dcos cassandra --name=cassandra plan force-complete <plan (required)> <phase (required)> <step (required)>

Pod Operations

A deployed instance of the DC/OS Apache Cassandra service is made up of a set of running pods. Using the pod API, it is possible to manage the lifecycle of these pods as well as investigate failures of the pods and their tasks.

Note: Managing pod count is done via the service configuration and does not have a direct API.

Listing

To list all the pods of the service, run the CLI command:

          dcos cassandra --name=cassandra pod list

Status

To view the status of all pods, or optionally just one pod, run the CLI command:

          dcos cassandra --name=cassandra pod status <pod-name (optional)>                  

This will show any status overrides for pods and their tasks.

Restart

To restart a pod in place, use the CLI command:

          dcos cassandra --name=cassandra pod restart <pod-name>

This will kill the tasks of the pod and relaunch them in place. The progress of the restart can be monitored by watching the recovery plan of the service.
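If you want to script that monitoring rather than re-running the status command by hand, a small polling loop works. This is a sketch; the dcos invocation in the usage comment assumes a service named cassandra:

```shell
# Sketch: poll a plan-status command until its top line reports COMPLETE.
wait_for_plan() {
  # $1 = command printing plan status, $2 = max attempts (default 30)
  attempts=0
  max=${2:-30}
  while [ "$attempts" -lt "$max" ]; do
    # The first line of `plan status` output summarizes the whole plan.
    if $1 | head -n 1 | grep -q 'COMPLETE'; then
      echo "plan complete"
      return 0
    fi
    attempts=$((attempts + 1))
    sleep "${POLL_INTERVAL:-5}"
  done
  echo "timed out waiting for plan"
  return 1
}

# Usage (assumes a service named cassandra):
#   wait_for_plan "dcos cassandra --name=cassandra plan status recovery"
```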

Replace

Replace should be used only when the current instance of the pod should be completely destroyed. All persistent data (read: volumes) of the pod will be destroyed. Replace should be used when a DC/OS agent is being removed, is permanently down, or pod placement constraints need to be updated.

Issue a replace by running the CLI command:

          dcos cassandra --name=cassandra pod replace <pod-name>

Pause

Pausing a pod relaunches it in an idle command state. This allows you to debug the contents of the pod, possibly making changes to fix issues, while still having access to all the context of the pod (such as volumes).

Using pause together with dcos task exec is a very powerful debugging tool. To pause a pod, use the CLI command:

          dcos cassandra --name=cassandra debug pod pause <pod-name>

To pause a specific task of a pod, append the -t <task-name> flag:

          dcos cassandra --name=cassandra debug pod pause <pod-name> -t <task-name>

Use the pod status command to check which tasks and pods are in an overridden state.
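As a convenience, that check can be scripted by filtering the status output. This is only a sketch: the exact PAUSED marker is an assumption about the output format, so adjust the pattern to what your CLI version actually prints.

```shell
# Sketch: filter pod status output for paused (overridden) tasks.
# The PAUSED marker is an assumption about the status output format.
list_paused() {
  grep 'PAUSED' || echo "no paused tasks"
}

# Usage against a real cluster:
#   dcos cassandra --name=cassandra pod status | list_paused
```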

Resume

To resume a paused pod or task, use the CLI command:

          dcos cassandra --name=cassandra debug pod resume <pod-name> [-t <task-name>]

Check the pod status command to verify that the pod (or task) has properly resumed.

Service Metrics

The DC/OS Apache Cassandra service pushes metrics into the DC/OS metrics system. Details on consuming them can be found in the documentation on the DC/OS metrics system.

Logging

DC/OS has three ways to access service scheduler and service task logs.

  1. Via the DC/OS GUI
  2. Via the Mesos GUI
  3. Via the DC/OS CLI

DC/OS GUI

A service's logs are accessed by selecting the service in the Services tab. Both the service scheduler and service tasks are displayed side by side in this view. To view a task's sandbox (files) as well as its stdout and stderr, click on the task.

Mesos GUI

The Mesos GUI provides similar access to that of the DC/OS UI, but the service scheduler is separate from the service tasks. The service tasks can all be found in the Frameworks tab under the framework with the same name as the service. The service scheduler can be found in the Marathon framework; it will be a task with the same name as the service.

Access both the files and logs of a service by clicking on its sandbox link.

contents of a scheduler sandbox

Figure 1 - Scheduler sandbox

DC/OS CLI

The dcos task log subcommand allows you to pull logs from multiple tasks at the same time. The dcos task log <task-pattern> command will fetch logs for all tasks matching the prefix pattern. See the subcommand's help for full details.

Mesos Agent logs

Occasionally, it can also be useful to examine what a given Mesos agent is doing. The Mesos Agent handles deployment of Mesos tasks to a given physical system in the cluster. One Mesos Agent runs on each system. These logs can be useful for determining whether there is a problem at the system level that is causing alerts across multiple services on that system.

Mesos agent logs can be accessed via the Mesos GUI or directly on the agent. The GUI method is described here.

Navigate to the agent you want to view, either directly from a task by clicking the "Agent" item in the breadcrumb when viewing a task (this will go directly to the agent hosting the task), or by navigating through the "Agents" menu item at the top of the screen (you will need to select the desired agent from the list).

In the Agent view, you will see a list of frameworks with a presence on that Agent. In the left pane you will see a plain link named "LOG". Click that link to view the agent logs.

view of tasks running on a given agent

Figure 2 - List of frameworks

Performing Cassandra Cleanup and Repair Operations

You may manually trigger certain nodetool operations against your Cassandra instance using the CLI or the HTTP API.

Cleanup

You may trigger a nodetool cleanup operation across your Cassandra nodes using the cleanup plan. This plan requires the following parameter to run:

  • CASSANDRA_KEYSPACE: the Cassandra keyspace to be cleaned up.

To initiate this plan from the command line:

          dcos cassandra --name=<service-name> plan start cleanup -p CASSANDRA_KEYSPACE=space1                  

To view the status of this plan from the command line:

          dcos cassandra --name=<service-name> plan status cleanup
          cleanup (IN_PROGRESS)
          └─ cleanup-deploy (IN_PROGRESS)
             ├─ node-0:[cleanup] (COMPLETE)
             ├─ node-1:[cleanup] (STARTING)
             └─ node-2:[cleanup] (PENDING)

When the plan is completed, its status will be COMPLETE.

The above plan start and plan status commands may also be made directly to the service over HTTP. To see the queries involved, run the above commands with an additional -v flag.

For more information about nodetool cleanup, see the Cassandra documentation.

Repair

You may trigger a nodetool repair operation across your Cassandra nodes using the repair plan. This plan requires the following parameter to run:

  • CASSANDRA_KEYSPACE: the Cassandra keyspace to be repaired.

To initiate this plan from the command line:

          dcos cassandra --name=<service-name> plan start repair -p CASSANDRA_KEYSPACE=space1

To view the status of this plan from the command line:

          dcos cassandra --name=<service-name> plan status repair
          repair (STARTING)
          └─ repair-deploy (STARTING)
             ├─ node-0:[repair] (STARTING)
             ├─ node-1:[repair] (PENDING)
             └─ node-2:[repair] (PENDING)

When the plan is completed, its status will be COMPLETE.

The above plan start and plan status commands may also be made directly to the service over HTTP. To see the queries involved, run the above commands with an additional -v flag.

For more information about nodetool repair, see the Cassandra documentation.

Seed nodes

Cassandra seed nodes are those nodes with indices smaller than the seed node count. By default, Cassandra is deployed with a seed node count of 2 (node-0 and node-1 are seed nodes). When a replace operation is performed on one of these nodes, all other nodes must be restarted to be brought up to date regarding the IP address of the new seed node. This operation is performed automatically.
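The seed/non-seed distinction is purely index-based, so it can be expressed as a one-line check. A sketch (the default seed count of 2 comes from the text above):

```shell
# Sketch: a node is a seed iff its index is below the seed node count.
is_seed_node() {
  # $1 = node index, $2 = seed node count (default 2, per the text above)
  [ "$1" -lt "${2:-2}" ]
}

# With the default seed count, node-0 and node-1 are the seeds:
is_seed_node 0 && echo "node-0 is a seed"
is_seed_node 2 || echo "node-2 is not a seed"
```

This is why replacing node-0 or node-1 triggers the rolling restart described above, while replacing node-2 does not.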

For example, if node-0 needed to be replaced you would execute:

          dcos cassandra --name=<service-name> pod replace node-0

which would result in a recovery plan like the following:

          $ dcos cassandra --name=<service-name> plan show recovery
          recovery (IN_PROGRESS)
          └─ permanent-node-failure-recovery (IN_PROGRESS)
             ├─ node-0:[server] (COMPLETE)
             ├─ node-1:[server] (STARTING)
             └─ node-2:[server] (PENDING)
             ...

Note: Only the seed node is being placed on a new node; all other nodes are restarted in place with no loss of data.

Backup and Restore

Backing Up to S3

You can back up an entire cluster's data and schema to Amazon S3 using the backup-s3 plan. This plan requires the following parameters to run:

  • SNAPSHOT_NAME: the name of this snapshot. Snapshots for individual nodes will be stored as S3 folders inside of a top level snapshot folder.
  • CASSANDRA_KEYSPACES: the Cassandra keyspaces to back up. The entire keyspace, as well as its schema, will be backed up for each keyspace specified.
  • AWS_ACCESS_KEY_ID: the access key ID for the AWS IAM user running this backup
  • AWS_SECRET_ACCESS_KEY: the secret access key for the AWS IAM user running this backup
  • AWS_REGION: the region of the S3 bucket being used to store this backup
  • AWS_SESSION_TOKEN: needed if you're including temporary security credentials in the file
  • S3_BUCKET_NAME: the name of the S3 bucket in which to store this backup
  • HTTPS_PROXY: specifications for the backup plan, taken from config.yaml.

Optional parameters:

  • AWS_SESSION_ID: It may also be necessary to set the session ID, depending on how you authenticate with AWS.
  • AWS_SESSION_TOKEN: It may also be necessary to set the session token, depending on how you authenticate with AWS.
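Before launching the plan, it can be worth checking that the required parameters are actually set in your shell. A minimal sketch: the variable names come from the required list above, but the check itself is not part of the service.

```shell
# Sketch: verify the required backup-s3 parameters are set before
# running `plan start`. Variable names follow the required list above.
check_backup_env() {
  missing=""
  for var in SNAPSHOT_NAME AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY \
             AWS_REGION S3_BUCKET_NAME; do
    eval "val=\${$var:-}"
    if [ -z "$val" ]; then
      missing="$missing $var"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "ok"
}
```

Run check_backup_env before plan start; a non-zero exit lists which parameters still need values.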

Make sure that you provision your nodes with enough disk space to perform a backup. Backups are stored on disk before being uploaded to S3, and will take up as much space as the data currently in the tables, so you will need half of your total available space to be free in order to back up every keyspace at once.
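That rule of thumb can be checked mechanically. A sketch, assuming `df -k` output where column 3 is used KB and column 4 is available KB; the data path in the usage comment is a placeholder, not a path the service guarantees:

```shell
# Sketch: a backup stages snapshots on local disk first, so require at
# least as much free space as the data currently uses.
has_room_for_backup() {
  # $1 = used KB, $2 = available KB (e.g. from `df -k`)
  [ "$2" -ge "$1" ]
}

# Usage against a real data directory (path is a placeholder):
#   set -- $(df -k /var/lib/mesos | tail -n 1)
#   has_room_for_backup "$3" "$4" && echo "enough free space for a backup"
```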

As noted in the documentation for the backup/restore strategy configuration option, it is possible to run transfers to S3 either in series or in parallel, but care must be taken not to exceed any throughput limits you may have in your cluster. Throughput depends on a variety of factors, including uplink speed, proximity to the region where the backups are being uploaded and downloaded, and the performance of the underlying storage infrastructure. You should perform periodic tests in your local environment to understand what you can expect from S3.

You can configure whether snapshots are created and uploaded in serial (the default) or in parallel. The serial backup/restore strategy is recommended.

You can initiate this plan from the command line:

          SNAPSHOT_NAME=<my_snapshot>
          CASSANDRA_KEYSPACES="space1 space2"
          AWS_ACCESS_KEY_ID=<my_access_key_id>
          AWS_SECRET_ACCESS_KEY=<my_secret_access_key>
          AWS_REGION=us-west-2
          AWS_SESSION_TOKEN=AQoDYXdzEJr...<remainder of security token>
          S3_BUCKET_NAME=backups
          dcos cassandra --name=<service-name> plan start backup-s3 \
              -p SNAPSHOT_NAME=$SNAPSHOT_NAME \
              -p "CASSANDRA_KEYSPACES=$CASSANDRA_KEYSPACES" \
              -p AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID \
              -p AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY \
              -p AWS_REGION=$AWS_REGION \
              -p S3_BUCKET_NAME=$S3_BUCKET_NAME \
              -p HTTPS_PROXY=http://internal.proxy:8080

If you are backing up multiple keyspaces, they must be separated by spaces and wrapped in quotation marks when supplied to the plan start command, as in the example above. If the CASSANDRA_KEYSPACES parameter isn't supplied, then every keyspace in your cluster will be backed up.
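A quick way to see why the quotes matter: the keyspace list must reach the CLI as a single argument. In this sketch, count_args is a stand-in for the dcos CLI that simply reports how many arguments it received:

```shell
# Sketch: the keyspace list must arrive as ONE argument. count_args
# stands in for the dcos CLI and reports how many arguments it gets.
count_args() { echo "$#"; }

KEYSPACES="space1 space2"

# Quoted: -p plus one combined value, i.e. 2 arguments.
count_args -p "CASSANDRA_KEYSPACES=$KEYSPACES"

# Unquoted: the shell splits on the space, i.e. 3 arguments, and the
# second keyspace would be misread by the real CLI.
count_args -p CASSANDRA_KEYSPACES=$KEYSPACES
```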

WARNING: To ensure that sensitive data such as your AWS secret access key remains secure, make sure that you've set the core.dcos_url configuration property in the DC/OS CLI to an HTTPS URL.

To view the status of this plan from the command line:

          dcos cassandra --name=<service-name> plan status backup-s3
          backup-s3 (IN_PROGRESS)
          ├─ backup-schema (COMPLETE)
          │  ├─ node-0:[backup-schema] (COMPLETE)
          │  ├─ node-1:[backup-schema] (COMPLETE)
          │  └─ node-2:[backup-schema] (COMPLETE)
          ├─ create-snapshots (IN_PROGRESS)
          │  ├─ node-0:[snapshot] (STARTED)
          │  ├─ node-1:[snapshot] (STARTED)
          │  └─ node-2:[snapshot] (COMPLETE)
          ├─ upload-backups (PENDING)
          │  ├─ node-0:[upload-s3] (PENDING)
          │  ├─ node-1:[upload-s3] (PENDING)
          │  └─ node-2:[upload-s3] (PENDING)
          └─ cleanup-snapshots (PENDING)
             ├─ node-0:[cleanup-snapshot] (PENDING)
             ├─ node-1:[cleanup-snapshot] (PENDING)
             └─ node-2:[cleanup-snapshot] (PENDING)

The above plan start and plan status commands may also be made directly to the service over HTTP. To see the queries involved, run the above commands with an additional -v flag.

Backing up to Azure

You can also back up to Microsoft Azure using the backup-azure plan. This plan requires the following parameters to run:

  • SNAPSHOT_NAME: the name of this snapshot. Snapshots for individual nodes will be stored as gzipped tarballs with the name node-<POD_INDEX>.tar.gz.
  • CASSANDRA_KEYSPACES: the Cassandra keyspaces to back up. The entire keyspace, as well as its schema, will be backed up for each keyspace specified.
  • CLIENT_ID: the client ID for the Azure service principal running this backup
  • TENANT_ID: the tenant ID for the tenant that the service principal belongs to
  • CLIENT_SECRET: the service principal's secret key
  • AZURE_STORAGE_ACCOUNT: the name of the storage account that this backup will be sent to
  • AZURE_STORAGE_KEY: the secret key associated with the storage account
  • CONTAINER_NAME: the name of the container in which to store this backup.

You can initiate this plan from the command line in the same fashion as the Amazon S3 backup plan:

          dcos cassandra --name=<service-name> plan start backup-azure \
              -p SNAPSHOT_NAME=$SNAPSHOT_NAME \
              -p "CASSANDRA_KEYSPACES=$CASSANDRA_KEYSPACES" \
              -p CLIENT_ID=$CLIENT_ID \
              -p TENANT_ID=$TENANT_ID \
              -p CLIENT_SECRET=$CLIENT_SECRET \
              -p AZURE_STORAGE_ACCOUNT=$AZURE_STORAGE_ACCOUNT \
              -p AZURE_STORAGE_KEY=$AZURE_STORAGE_KEY \
              -p CONTAINER_NAME=$CONTAINER_NAME

To view the status of this plan from the command line:

          dcos cassandra --name=<service-name> plan status backup-azure
          backup-azure (IN_PROGRESS)
          ├─ backup-schema (COMPLETE)
          │  ├─ node-0:[backup-schema] (COMPLETE)
          │  ├─ node-1:[backup-schema] (COMPLETE)
          │  └─ node-2:[backup-schema] (COMPLETE)
          ├─ create-snapshots (COMPLETE)
          │  ├─ node-0:[snapshot] (COMPLETE)
          │  ├─ node-1:[snapshot] (COMPLETE)
          │  └─ node-2:[snapshot] (COMPLETE)
          ├─ upload-backups (IN_PROGRESS)
          │  ├─ node-0:[upload-azure] (COMPLETE)
          │  ├─ node-1:[upload-azure] (STARTING)
          │  └─ node-2:[upload-azure] (PENDING)
          └─ cleanup-snapshots (PENDING)
             ├─ node-0:[cleanup-snapshot] (PENDING)
             ├─ node-1:[cleanup-snapshot] (PENDING)
             └─ node-2:[cleanup-snapshot] (PENDING)

The above plan start and plan status commands may also be made directly to the service over HTTP. To see the queries involved, run the above commands with an additional -v flag.

Restore

All restore plans will restore the schema from every keyspace backed up with the backup plan and populate those keyspaces with the data they contained at the time the snapshot was taken. Downloading and restoration of backups will use the configured backup/restore strategy. This plan assumes that the keyspaces being restored do not already exist in the current cluster, and will fail if any keyspace with the same name is present.

Restoring From S3

Restoring cluster data is similar to backing it up. The restore-s3 plan assumes that your data is stored in an S3 bucket in the format that backup-s3 uses. The restore plan has the following required parameters:

  • SNAPSHOT_NAME: the snapshot name from the backup-s3 plan
  • AWS_ACCESS_KEY_ID: the access key ID for the AWS IAM user running this restore
  • AWS_SECRET_ACCESS_KEY: the secret access key for the AWS IAM user running this restore
  • AWS_REGION: the region of the S3 bucket being used to store the backup being restored
  • S3_BUCKET_NAME: the name of the S3 bucket where the backup is stored

Optional parameters:

  • AWS_SESSION_ID: It may also be necessary to set the session ID, depending on how you authenticate with AWS.
  • AWS_SESSION_TOKEN: It may also be necessary to set the session token, depending on how you authenticate with AWS.

To initiate this plan from the command line:

          SNAPSHOT_NAME=<my_snapshot>
          CASSANDRA_KEYSPACES="space1 space2"
          AWS_ACCESS_KEY_ID=<my_access_key_id>
          AWS_SECRET_ACCESS_KEY=<my_secret_access_key>
          AWS_REGION=us-west-2
          S3_BUCKET_NAME=backups
          dcos cassandra --name=<service-name> plan start restore-s3 \
              -p SNAPSHOT_NAME=$SNAPSHOT_NAME \
              -p "CASSANDRA_KEYSPACES=$CASSANDRA_KEYSPACES" \
              -p AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID \
              -p AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY \
              -p AWS_REGION=$AWS_REGION \
              -p S3_BUCKET_NAME=$S3_BUCKET_NAME

To view the status of this plan from the command line:

          dcos cassandra --name=<service-name> plan status restore-s3
          restore-s3 (IN_PROGRESS)
          ├─ fetch-s3 (COMPLETE)
          │  ├─ node-0:[fetch-s3] (COMPLETE)
          │  ├─ node-1:[fetch-s3] (COMPLETE)
          │  └─ node-2:[fetch-s3] (COMPLETE)
          ├─ restore-schema (IN_PROGRESS)
          │  ├─ node-0:[restore-schema] (COMPLETE)
          │  ├─ node-1:[restore-schema] (STARTED)
          │  └─ node-2:[restore-schema] (PENDING)
          └─ restore-snapshots (PENDING)
             ├─ node-0:[restore-snapshot] (PENDING)
             ├─ node-1:[restore-snapshot] (PENDING)
             └─ node-2:[restore-snapshot] (PENDING)

The above plan start and plan status commands may also be made directly to the service over HTTP. To see the queries involved, run the above commands with an additional -v flag.

Restoring From Azure

You can restore from Microsoft Azure using the restore-azure plan. This plan requires the following parameters to run:

  • SNAPSHOT_NAME: the name of the snapshot to restore. Snapshots for individual nodes are stored as gzipped tarballs with the name node-<POD_INDEX>.tar.gz.
  • CLIENT_ID: the client ID for the Azure service principal running this restore
  • TENANT_ID: the tenant ID for the tenant that the service principal belongs to
  • CLIENT_SECRET: the service principal's secret key
  • AZURE_STORAGE_ACCOUNT: the name of the storage account where the backup is stored
  • AZURE_STORAGE_KEY: the secret key associated with the storage account
  • CONTAINER_NAME: the name of the container in which the backup is stored

You can initiate this plan from the command line in the same way as the Amazon S3 restore plan:

          dcos cassandra --name=<service-name> plan start restore-azure \
              -p SNAPSHOT_NAME=$SNAPSHOT_NAME \
              -p CLIENT_ID=$CLIENT_ID \
              -p TENANT_ID=$TENANT_ID \
              -p CLIENT_SECRET=$CLIENT_SECRET \
              -p AZURE_STORAGE_ACCOUNT=$AZURE_STORAGE_ACCOUNT \
              -p AZURE_STORAGE_KEY=$AZURE_STORAGE_KEY \
              -p CONTAINER_NAME=$CONTAINER_NAME

To view the status of this plan from the command line:

          dcos cassandra --name=<service-name> plan status restore-azure
          restore-azure (IN_PROGRESS)
          ├─ fetch-azure (COMPLETE)
          │  ├─ node-0:[fetch-azure] (COMPLETE)
          │  ├─ node-1:[fetch-azure] (COMPLETE)
          │  └─ node-2:[fetch-azure] (COMPLETE)
          ├─ restore-schema (COMPLETE)
          │  ├─ node-0:[restore-schema] (COMPLETE)
          │  ├─ node-1:[restore-schema] (COMPLETE)
          │  └─ node-2:[restore-schema] (COMPLETE)
          └─ restore-snapshots (IN_PROGRESS)
             ├─ node-0:[restore-snapshot] (COMPLETE)
             ├─ node-1:[restore-snapshot] (STARTING)
             └─ node-2:[restore-snapshot] (PENDING)

The above plan start and plan status commands may also be made directly to the service over HTTP. To see the queries involved, run the above commands with an additional -v flag.

Source: https://docs.d2iq.com/mesosphere/dcos/services/cassandra/2.4.0-3.0.16/operations
