Friday, August 29, 2014

MySQL - Character Sets and Collations



A character set is a set of symbols and encodings. A collation is a set of rules for comparing characters in a character set. Let's make the distinction clear with an example of an imaginary character set.
Suppose that we have an alphabet with four letters: 'A', 'B', 'a', 'b'. We give each letter a number: 'A' = 0, 'B' = 1, 'a' = 2, 'c' = 3. The letter 'A' is a symbol, the number 0 is the encoding for 'A', and the combination of all four letters and their encodings is a character set.
Now, suppose that we want to compare two string values, 'A' and 'B'. The simplest way to do this is to look at the encodings: 0 for 'A' and 1 for 'B'. Because 0 is less than 1, we say 'A' is less than 'B'. Now, what we've just done is apply a collation to our character set. The collation is a set of rules (only one rule in this case): "compare the encodings." We call this simplest of all possible collations a binary collation.
But what if we want to say that the lowercase and uppercase letters are equivalent? Then we would have at least two rules: (1) treat the lowercase letters 'a' and 'b' as equivalent to 'A' and 'B'; (2) then compare the encodings. We call this a case-insensitive collation. It's a little more complex than a binary collation.

Using this example, you can change character set and collation for a MySQL database table(s).
Most likely you will be need to do this if you haven’t specified character set and collation at the time of database/table creation and default character set/collation applied are not desirable.

------------------------------------------------------------------------------------------------------------------

Setting MySQL default character set and collation in my.cnf

Below are settings for MySQL version 5.5.9 and onwards.
Put them in /etc/mysql/my.cnf is correct sections. Please be careful as some settings might be already present.
[mysqld]
character-set-server=utf8 
collation-server = utf8_unicode_ci
init-connect='SET NAMES utf8'
init_connect='SET collation_connection = utf8_unicode_ci' 
skip-character-set-client-handshake
 
Next, restart mysql and log into mysql shell:

mysql> show variables like "%character%";show variables like "%collation%";

Sample output as:
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

+----------------------+-----------------+
| Variable_name        | Value           |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database   | utf8_general_ci |
| collation_server     | utf8_general_ci |
+----------------------+-----------------+
3 rows in set (0.00 sec)

Checking current character set and collation for database/table/columns

For Database:

SELECT default_character_set_name, default_collation_name FROM information_schema.SCHEMATA  
WHERE schema_name = "databasename";
 
It will show output as:
+----------------------------+------------------------+
| default_character_set_name | default_collation_name |
+----------------------------+------------------------+
| latin1                     | latin1_swedish_ci      |
+----------------------------+------------------------+

For Tables:

SELECT T.table_name, T.table_collation, CCSA.character_set_name FROM information_schema.`TABLES` T,
information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` CCSA
WHERE CCSA.collation_name = T.table_collation
AND T.table_schema = "databasename";
 
Sample output as below:
+-----------------------------------------------------+-------------------+--------------------+
| table_name                                          | table_collation   | character_set_name |
+-----------------------------------------------------+-------------------+--------------------+
| rtc_wp_rtAccountToken                               | latin1_swedish_ci | latin1             |
| rtc_wp_rtAccountVerify                              | latin1_swedish_ci | latin1             |
| rtc_wp_rt_crm_mail_messageids                       | latin1_swedish_ci | latin1             |
| rtc_wp_w3tc_cdn_queue                               | latin1_swedish_ci | latin1             |
| gp_meta                                             | utf8_general_ci   | utf8               |
+-----------------------------------------------------+-------------------+--------------------+

For Columns:

SELECT table_name, column_name, character_set_name, collation_name FROM information_schema.`COLUMNS` C 
WHERE character_set_name != 'NULL' AND table_schema = "db_name"
 
Sample Output:
+------------------------+--------------+--------------------+-------------------+
| table_name             | column_name  | character_set_name | collation_name    |
+------------------------+--------------+--------------------+-------------------+
| rtc_wp_rtAccountToken  | accesstoken  | latin1             | latin1_swedish_ci |
| rtc_wp_rtAccountToken  | refreshtoken | latin1             | latin1_swedish_ci |
| rtc_wp_rtAccountVerify | email        | latin1             | latin1_swedish_ci |
| rtc_wp_rtAccountVerify | type         | latin1             | latin1_swedish_ci |
| rtc_wp_rtAccountVerify | code         | latin1             | latin1_swedish_ci |
+------------------------+--------------+--------------------+-------------------+

Converting character set and collations

MAKE BACKUP

We are serious. Just use mysqldump rather than regretting it later

Changing Database Character Sets and Collations

This is simplest:
ALTER DATABASE db_name CHARACTER SET utf8 COLLATE utf8_general_ci;
Replace your database name with db_name. Also after running query verify if database-level defaults are changed indeed.

Changing Tables Character Sets and Collations

Below is a syntax to covert character set ofwp_posts and wp_postmetatables.
alter table wp_posts convert to character set utf8 collate utf8_unicode_ci;
alter table wp_postmeta convert to character set utf8 collate utf8_unicode_ci;
If you want to covert all your MySQL tables, then run a command like below on database db_wordpress
mysql -e "SELECT concat('alter table ', TABLE_NAME , ' convert to character set utf8 collate utf8_unicode_ci;')
FROM information_schema.TABLES
WHERE table_schema = 'db_wordpress'
AND TABLE_COLLATION = 'latin1_swedish_ci'" |
tail -n+2 > collation.sql
After you run above query, check collation.sql content to verify if all rows are correct. If collation.sql is empty, you probably do not have a table using MyISAM engine.
If all looks good, run following to convert all mysql tables to InnoDB.
mysql db_wordpress < collation.sql

Changing Column Character Sets and Collations

Below is syntax to convert columns to utf8
alter table table_name change col_name col_name col_data_type character set utf8;
Please note that we have to use same col_name twice!
col_data_type can be found form a sql query like…
mysql> SELECT table_name, column_name, data_type, character_set_name, collation_name FROM information_schema.`COLUMNS` WHERE  table_schema = "db_name" AND table_name = "table_name" AND column_name = "col_name";
Sample output:
+--------------+--------------+-----------+
| table_name   | column_name  | data_type |
+--------------+--------------+-----------+
| wp_posts     | post_content | longtext  |
+--------------+--------------+-----------+
Example for wordpress’s wp_posts table
alter table wp_posts change post_content post_content LONGTEXT CHARACTER SET utf8;
Please be very careful for column conversion. Specially if you have non-english characters stored in database.

Source: https://rtcamp.com/tutorials/mysql/character-sets-collations/

Thursday, August 21, 2014

TortoiseGit - git not found

While this question is still hot... some nice people contributed lots of bugfixes to all three projects, so this is what I did to get TortoiseGit on Win7x64, previously failing on all combinations:
  1. install mSysGit (network installer) into C:\msysgit, it will download the source and compile it leaving you in a bash git prompt. Stable version: msysGit-netinstall-1.7.2.3-preview20100911.exe 13 sep
  2. install Git “preview” into C:\Program Files (x86)\Git, choose OpenSSH for ssh link Stable version: Git-1.7.2.3-preview20100911.exe 13 sep
  3. install tortoisegit into C:\Program Files\TortoiseGit, (x64 version) and configure it’s settings specifying the git path (C:\msysgit\bin) and menu integration. Stable version: TortoiseGit-1.5.6.0-64bit.msi 25 sep
This setup picked up my existing git repos made on WinXP x86 with older versions of the packages, and seems fairly stable and fully functional.

Source: http://stackoverflow.com/questions/1389281/tortoisegit-git-not-found