
Commit af819c5

newkek authored and olim7t committed
Add Astyanax upgrade guide (apache#732)
1 parent a0e2afd commit af819c5

6 files changed

Lines changed: 448 additions & 2 deletions

manual/object_mapper/using/README.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -146,13 +146,13 @@ the client a chance to customize the statement before executing it.
146146
- `Mapper.getQuery(userId)`: returns a statement to select a row in the
147147
database, selected on the given `userId`, and matching the mapped
148148
object structure.
149-
- `Mapper.deleteQuery(userID)`: returns a statement delete a row in the
149+
- `Mapper.deleteQuery(userID)`: returns a statement to delete a row in the
150150
database given the `userId` provided. This method can also accept a
151151
mapped object instance.
152152

153153
#### Manual mapping
154154

155-
`Mapper#map` provides a way to converts the results of a regular query:
155+
`Mapper#map` provides a way to convert the results of a regular query:
156156

157157
```java
158158
ResultSet results = session.execute("SELECT * FROM user");
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
language_level_changes
2+
configuration
3+
queries_and_results
Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
1+
# Migrating from Astyanax
2+
3+
This section is a guide for users previously using *Astyanax* and looking for
4+
migrating to the *DataStax Java driver*.
5+
6+
See the child pages for more information:
7+
8+
* [Changes at the language level](language_level_changes/)
9+
* [Migrating Astyanax configurations to DataStax Java driver configurations](configuration/)
10+
* [Comparison of querying and retrieving results](queries_and_results/)
Lines changed: 252 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,252 @@
1+
# Configuration
2+
3+
## How configuring the Java driver works
4+
5+
The two basic components in the Java driver are the `Cluster` and the `Session`.
6+
The `Cluster` is the object to create first, and the one to which all global configurations
7+
apply. Connecting to the `Cluster` creates a `Session`. Queries are executed
8+
through the `Session`.
9+
10+
The `Cluster` object can therefore be viewed as the equivalent of the `AstyanaxContext`
11+
object. "Starting" an `AstyanaxContext` object typically returns a `Keyspace`
12+
object; the `Keyspace` object is the equivalent of the *Java driver*’s `Session`.
13+
14+
Configuring a `Cluster` works with the *Builder* pattern. The `Builder` takes all
15+
the configurations into account before building the `Cluster`.
16+
17+
The following are examples of the most important configurations that were
18+
possible with *Astyanax* and how to translate them into *DataStax Java driver*
19+
configurations. Please note that the Java driver has been optimized to handle most use
20+
cases well. Even though the following sections show how to tune various
21+
options, the driver should provide the best performance with the default configuration,
22+
and these options should not be changed unless there is a good reason to.
23+
24+
## Connection pools
25+
26+
Connection pools in *Astyanax* are configured through the
27+
`ConnectionPoolConfigurationImpl`. This object gathers important configurations
28+
that the *Java driver* has split into several kinds of *Options* and *Policies*.
29+
30+
### Connection pool internals
31+
Everything concerning the internal pools of connections to the *Cassandra nodes*
32+
is gathered in the Java driver in the [`PoolingOptions`](../../../manual/pooling):
33+
34+
*Astyanax*:
35+
36+
```java
37+
ConnectionPoolConfigurationImpl cpool =
38+
    new ConnectionPoolConfigurationImpl("myConnectionPool")
39+
        .setInitConnsPerHost(2)
40+
        .setMaxConnsPerHost(3);
41+
```
42+
43+
*Java driver*:
44+
45+
```java
46+
PoolingOptions poolingOptions =
47+
    new PoolingOptions()
48+
        .setConnectionsPerHost(HostDistance.LOCAL, 2, 3);
49+
```
50+
The first number is the initial number of connections, the second is the maximum number
51+
of connections the driver is allowed to create for each host.
52+
53+
Note that the *Java driver* allows multiple simultaneous requests on one single
54+
connection, as it is based upon the [*Native protocol*](../../../manual/native_protocol),
55+
an asynchronous binary protocol that can handle up to 32768 simultaneous requests on a
56+
single connection. The Java driver is able to manage and distribute simultaneous requests
57+
by itself even under high contention, and changing the default `PoolingOptions` is not
58+
necessary most of the time except for very [specific use cases](../../../manual/pooling/#tuning-protocol-v3-for-very-high-throughputs).
59+
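As a rough illustration of why the default pool sizes are usually enough, the sketch below computes the theoretical ceiling of in-flight requests per host from the figures quoted above (at most 3 connections, up to 32768 requests per connection). The class and method names are made up for this example and are not part of the driver; the driver's actual per-connection limit is configurable and may be set lower than the protocol maximum.

```java
public class PoolCapacity {
    // Protocol-level ceiling quoted in the text above.
    public static final int MAX_REQUESTS_PER_CONNECTION = 32768;

    /** Upper bound of simultaneous requests that one host's pool could carry. */
    public static long maxInFlightPerHost(int maxConnectionsPerHost) {
        return (long) maxConnectionsPerHost * MAX_REQUESTS_PER_CONNECTION;
    }

    public static void main(String[] args) {
        // With the (2, 3) setting used above, the pool grows to at most 3 connections per host.
        System.out.println(maxInFlightPerHost(3));
    }
}
```

Even a single connection leaves a lot of headroom, which is why adding connections is rarely the first thing to tune.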
60+
### Timeouts
61+
62+
Timeouts for requests and connections are part of the `SocketOptions`.
63+
64+
*Astyanax*:
65+
66+
```java
67+
ConnectionPoolConfigurationImpl cpool =
68+
    new ConnectionPoolConfigurationImpl("myConnectionPool")
69+
        .setSocketTimeout(3000)
70+
        .setConnectTimeout(3000);
71+
```
72+
73+
*Java driver*:
74+
75+
```java
76+
SocketOptions so =
77+
new SocketOptions()
78+
.setReadTimeoutMillis(3000)
79+
.setConnectTimeoutMillis(3000);
80+
```
81+
82+
Changing the client timeout options might have more impact than expected, so **please make
83+
sure you have read the relevant documentation before changing these options.**
84+
85+
## Load Balancing
86+
Both *Astyanax* and the *Java driver* connect to multiple nodes of the *Cassandra*
87+
cluster. Distributing requests across all the nodes plays an important role in
88+
the proper operation of the `Cluster` and in getting the best performance. With *Astyanax*,
89+
requests (or “operations”) can be sent directly to replicas that have a copy of
90+
the data targeted by the *“Row key”* specified in the operation. This is known as the
91+
`TokenAware` connection pool type, and it relies on the low-level *Thrift* API
92+
forcing the user to provide *Row keys*. The same behaviour is available in the *Java driver*, but
93+
the configuration is different and provides more options to tweak.
94+
95+
Load balancing in the *Java driver* is a *Policy*: a class that is
96+
plugged into the *Java driver*’s code and whose methods the driver calls when it
97+
needs to. The *Java driver* comes with a set of predefined load balancing policies.
98+
Here is the equivalent code:
99+
100+
*Astyanax*:
101+
102+
```java
103+
final ConnectionPoolType poolType = ConnectionPoolType.TOKEN_AWARE;
104+
final NodeDiscoveryType discType = NodeDiscoveryType.RING_DESCRIBE;
105+
ConnectionPoolConfigurationImpl cpool =
106+
    new ConnectionPoolConfigurationImpl("myConnectionPool")
107+
        .setLocalDatacenter("myDC");
108+
AstyanaxConfigurationImpl aconf =
109+
    new AstyanaxConfigurationImpl()
110+
        .setConnectionPoolType(poolType)
111+
        .setDiscoveryType(discType);
112+
```
113+
114+
*Java driver*:
115+
116+
```java
117+
LoadBalancingPolicy lbp = new TokenAwarePolicy(
118+
DCAwareRoundRobinPolicy.builder()
119+
.withLocalDc("myDC")
120+
.build()
121+
);
122+
```
123+
124+
*By default*, the *Java driver* instantiates the exact load balancing policy
125+
shown above, with the `LocalDC` being the DC of the first host the driver connects
126+
to. So to get the same behaviour as the *TokenAware* pool type of *Astyanax*,
127+
users shouldn’t need to specify a load balancing policy, since the default one
128+
should cover it.
129+
130+
Important: since *CQL* is an abstraction over Cassandra’s architecture, a simple
131+
query needs to have the *Row key* specified explicitly on a `Statement` in order
132+
to benefit from the *TokenAware* routing (the *Row key* is referred to as the
133+
*Routing Key* in the *Java driver*). This was not necessary with *Astyanax*.
134+
Some differences exist between the different kinds of `Statements` the *Java
135+
driver* provides. Please see [this link](../../../manual/load_balancing/#token-aware-policy)
136+
for specific information.
137+
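To see why the routing key matters, here is a toy model of token-aware routing that uses only the JDK. It is not the driver's implementation: the ring layout, the node names and the use of `String.hashCode()` as a partitioner are all assumptions made for illustration. The point it shows is that a replica can only be picked up front when the key is known; without a key there is nothing to route on.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

public class TokenRingSketch {
    // token -> node owning the range that ends at this token (hypothetical layout)
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    public void addNode(int token, String node) {
        ring.put(token, node);
    }

    /** Toy partitioner standing in for the real hash function. */
    static int tokenOf(String routingKey) {
        return routingKey.hashCode();
    }

    /** Returns the owner of the key, or empty when no routing key is available. */
    public Optional<String> ownerOf(String routingKey) {
        if (routingKey == null || ring.isEmpty()) {
            return Optional.empty(); // no key, so no replica can be targeted
        }
        Map.Entry<Integer, String> entry = ring.ceilingEntry(tokenOf(routingKey));
        // Wrap around to the first node when the token is past the last one.
        return Optional.of(entry != null ? entry.getValue() : ring.firstEntry().getValue());
    }

    public static void main(String[] args) {
        TokenRingSketch ring = new TokenRingSketch();
        ring.addNode(-1_000_000_000, "node1");
        ring.addNode(0, "node2");
        ring.addNode(1_000_000_000, "node3");
        System.out.println(ring.ownerOf("user42"));
        System.out.println(ring.ownerOf(null));
    }
}
```

In this model, a statement without a routing key corresponds to the `ownerOf(null)` case: the request can still be sent, but not to a node chosen for the data it targets.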
138+
Custom load balancing policies can easily be implemented by users, and supplied to
139+
the *Java driver* for specific use cases. All information necessary is available
140+
in the [Load balancing policies docs](../../../manual/load_balancing).
141+
142+
## Consistency levels
143+
Consistency levels can be set per-statement, or globally through the `QueryOptions`.
144+
145+
*Astyanax*:
146+
147+
```java
148+
AstyanaxConfigurationImpl aconf =
149+
    new AstyanaxConfigurationImpl()
150+
        .setDefaultReadConsistencyLevel(ConsistencyLevel.CL_ALL)
151+
        .setDefaultWriteConsistencyLevel(ConsistencyLevel.CL_ALL);
152+
```
153+
154+
*Java driver*:
155+
156+
```java
157+
QueryOptions qo = new QueryOptions().setConsistencyLevel(ConsistencyLevel.ALL);
158+
```
159+
160+
Since the *Java driver* only executes *CQL* statements, which can be either reads
161+
or writes to *Cassandra*, it is not possible to globally configure the
162+
Consistency Level for only reads or only writes. To achieve this, since the Consistency
163+
Level can be set per-statement, you can either set it on every statement, or use
164+
`PreparedStatements` (if queries are to be repeated with different values): in
165+
this case, setting the CL on the `PreparedStatement` causes the `BoundStatements` to
166+
inherit the CL from the prepared statement they were created from. More
167+
information about how `Statement`s work in the *Java driver* is available
168+
in the [“Queries and Results” section](../queries_and_results/).
169+
170+
171+
## Authentication
172+
173+
Authentication settings are managed by the `AuthProvider` interface in the *Java driver*.
174+
It is highly customizable, but also comes with simple default implementations:
175+
176+
*Astyanax*:
177+
178+
```java
179+
AuthenticationCredentials authCreds = new SimpleAuthenticationCredentials("username", "password");
180+
ConnectionPoolConfigurationImpl cpool =
181+
    new ConnectionPoolConfigurationImpl("myConnectionPool")
182+
        .setAuthenticationCredentials(authCreds);
183+
```
184+
185+
*Java driver*:
186+
187+
```java
188+
AuthProvider authProvider = new PlainTextAuthProvider("username", "password");
189+
```
190+
191+
The `AuthProvider` interface can easily be implemented to suit the user’s needs;
192+
documentation about the classes needed is [available here](../../../manual/auth/).
193+
194+
## Hosts and ports
195+
196+
Setting the “seeds” or first hosts to connect to can be done directly on the
197+
Cluster configuration Builder:
198+
199+
*Astyanax*:
200+
201+
```java
202+
ConnectionPoolConfigurationImpl cpool =
203+
    new ConnectionPoolConfigurationImpl("myConnectionPool")
204+
        .setSeeds("127.0.0.1")
205+
        .setPort(9160);
206+
```
207+
208+
*Java driver*:
209+
210+
```java
211+
Cluster cluster = Cluster.builder()
212+
    .addContactPoint("127.0.0.1")
213+
    .withPort(9042).build();
214+
```
215+
216+
The *Java driver* by default connects to port *9042*, hence you can supply only
217+
host names with the `addContactPoints(String...)` method. Note that the contact
218+
points are only the entry points to the `Cluster` for the *Automatic discovery
219+
phase*.
220+
221+
## Building the Cluster
222+
With all options previously presented, one may configure and create the
223+
`Cluster` object this way:
224+
225+
*Java driver*:
226+
227+
```java
228+
Cluster cluster = Cluster.builder()
229+
.addContactPoint("127.0.0.1")
230+
.withAuthProvider(authProvider)
231+
.withLoadBalancingPolicy(lbp)
232+
.withSocketOptions(so)
233+
.withPoolingOptions(poolingOptions)
234+
.withQueryOptions(qo)
235+
.build();
236+
Session session = cluster.connect();
237+
```
238+
239+
## Best Practices
240+
241+
A few best practices are summed up in [this blog post](http://www.datastax.com/dev/blog/4-simple-rules-when-using-the-datastax-drivers-for-cassandra).
242+
243+
Concerning connection pools, the Java driver’s default settings should allow
244+
most users to get the best out of the driver in terms of throughput:
245+
they have been thoroughly tested and tweaked to accommodate users’ needs.
246+
If you still wish to change them, [monitoring the pools](../../../manual/pooling/#monitoring-and-tuning-the-pool) first is
247+
advised; a [deep dive into the pool management mechanism](../../../manual/pooling/) should then
248+
provide enough insight.
249+
250+
Many more options are available in the different `XxxxOptions` classes. Policies are
251+
also highly customizable, since the Java driver's base implementations can easily be
252+
extended to implement user-specific behaviour.
Lines changed: 75 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,75 @@
1+
# Language change: from Thrift to CQL
2+
The data model changes when using *CQL* (Cassandra Query Language).
3+
*CQL* provides an abstraction of the low-level data stored in *Cassandra*, as
4+
opposed to *Thrift*, which aims to expose the low-level data structure directly.
5+
[But note that this changes with Cassandra 3’s new storage engine.](http://www.datastax.com/2015/12/storage-engine-30)
6+
7+
*Thrift* exposes *Keyspaces*, and these *Keyspaces* contain *Column Families*. A
8+
*ColumnFamily* contains *Rows*, each of which holds an arbitrary number
9+
of column-values. With *CQL*, the data is **tabular**: a *ColumnFamily* is viewed
10+
as a *Table*, and the **Table Rows** have a **fixed and finite number of named columns**.
11+
The columns inside a *Thrift Row* are spread, in a tabular way, across several
12+
_Table Rows_. See the following figure:
13+
14+
```ditaa
15+
Thrift
16+
/- -\
17+
| |
18+
| /------------\ /---------------+---------------+---------------+---------+ |
19+
| | cRED | |cFA0 1 | 2 | 3 | | |
20+
| | 1 | ----------> +---------------+---------------+---------------+ ... | +--> One Thrift
21+
| | | |c1AB 'a' | 'b' | 'c' | | | ROW
22+
| \------------/ \---------------+---------------+---------------+---------+ |
23+
| |
24+
One Thrift | -/
25+
COLUMNFAMILY |
26+
|
27+
| /------------\ /---------------+---------------+---------+
28+
| | | | 1 | 2 | |
29+
| | 2 | ----------> +---------------+---------------+ ... |
30+
| | | | 'a' | 'b' | |
31+
| \------------/ \---------------+---------------+---------+
32+
|
33+
\-
34+
35+
36+
-----------------------------------------------------------------------
37+
38+
39+
CQL
40+
41+
/-
42+
|
43+
| /--------------------+---------------------------------+-----------------------------\
44+
| | key | column1 | value |
45+
| +--------------------+---------------------------------+-----------------------------+
46+
| | cRED 1 | cFA0 1 | c1AB 'a' |
47+
| +--------------------+---------------------------------+-----------------------------+ -\
48+
| | cRED 1 | 2 | 'b' | +--> One CQL
49+
One CQL | +--------------------+---------------------------------+-----------------------------+ -/ ROW
50+
TABLE | | cRED 1 | 3 | 'c' |
51+
| +--------------------+---------------------------------+-----------------------------+
52+
| | cRED ... | ... | ... |
53+
| +--------------------+---------------------------------+-----------------------------+
54+
| | 2 | 1 | 'a' |
55+
| +--------------------+---------------------------------+-----------------------------+
56+
| | 2 | 2 | 'b' |
57+
| +--------------------+---------------------------------+-----------------------------+
58+
| | ... | ... | ... |
59+
| +--------------------+---------------------------------+-----------------------------+
60+
\-
61+
```
62+
63+
Some of the columns of a *CQL Table* have a special role that is specifically
64+
related to the *Cassandra* architecture. The *Row key* of the *Thrift Row*
65+
becomes the *Partition Key* in the *CQL Table*, and can be composed of one or more
66+
*CQL columns* (the key column in the figure above). The *“Column”* part of the column-value
67+
component in a *Thrift Row* becomes the *Clustering Column* in *CQL*, and can
68+
also be composed of multiple columns (in the figure, column1 is the only column
69+
composing the *Clustering Column*, but there can be others if the Thrift ColumnComparator
70+
is a CompositeType).
71+
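The mapping described above can be sketched with plain Java collections. This is only an illustration of the data model, not driver or Cassandra code: the `CqlRow` class, the `flatten` method and the sample data are invented for this example, and column names and values are simplified to strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ThriftToCqlSketch {
    /** One CQL row: partition key, clustering column, value. */
    public static final class CqlRow {
        public final String key;
        public final String column1;
        public final String value;

        CqlRow(String key, String column1, String value) {
            this.key = key;
            this.column1 = column1;
            this.value = value;
        }

        @Override
        public String toString() {
            return key + " | " + column1 + " | " + value;
        }
    }

    /**
     * Turns a Thrift-style column family (row key -> sorted column/value pairs)
     * into CQL-style rows: one row per column of each Thrift row.
     */
    public static List<CqlRow> flatten(Map<String, TreeMap<String, String>> columnFamily) {
        List<CqlRow> rows = new ArrayList<>();
        for (Map.Entry<String, TreeMap<String, String>> thriftRow : columnFamily.entrySet()) {
            for (Map.Entry<String, String> column : thriftRow.getValue().entrySet()) {
                rows.add(new CqlRow(thriftRow.getKey(), column.getKey(), column.getValue()));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Same shape as the figure: row "1" has three columns, row "2" has two.
        Map<String, TreeMap<String, String>> columnFamily = new LinkedHashMap<>();
        TreeMap<String, String> row1 = new TreeMap<>();
        row1.put("1", "a");
        row1.put("2", "b");
        row1.put("3", "c");
        TreeMap<String, String> row2 = new TreeMap<>();
        row2.put("1", "a");
        row2.put("2", "b");
        columnFamily.put("1", row1);
        columnFamily.put("2", row2);
        flatten(columnFamily).forEach(System.out::println);
    }
}
```

Each Thrift row becomes as many CQL rows as it has columns, and all of them share the same partition key, which is what the figure depicts.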
72+
This is the basic architectural concept of *CQL*. A detailed explanation and *CQL*
73+
examples can be found in this article: [http://www.planetcassandra.org/making-the-change-from-thrift-to-cql/](http://www.planetcassandra.org/making-the-change-from-thrift-to-cql/).
74+
Understanding the *CQL* abstraction plays a key role in developing performant
75+
and scalable applications.
