Running multiple Claymore miner instances with different wallets and GPUs

A few days ago I switched from ethminer to Claymore’s Dual Ethereum Miner because ethminer has problems running multiple instances with multiple GPUs. My blog post How to run same ethminer instance with multiple GPUs is still valid, but ethminer simply can’t handle two or more instances running in parallel.

In addition to my previous blog post I want to show you how to mine with multiple GPUs into different Ethereum wallets. The following has been tested with Fedora 25.

Add a configuration file for each of your wallets

For each of your wallets create a configuration file named /etc/sysconfig/claymore-miner-$WALLETNAME with this content:

ETHER_ADDRESS=0x$YOURETHERADDRESS
SIA_ADDRESS=$YOURSIACOINADDRESS
EMAIL_ADDRESS=$EMAIL
## GPUs must *not* be separated by comma or whitespace!
## Use the first three GPU cards on your mainboard
GPUS=012
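
For example, two wallets could split a six-GPU rig like this (wallet names and addresses below are made-up placeholders):

# /etc/sysconfig/claymore-miner-wallet1 - first three GPUs
ETHER_ADDRESS=0x1111111111111111111111111111111111111111
SIA_ADDRESS=$YOURSIACOINADDRESS
EMAIL_ADDRESS=$EMAIL
GPUS=012

# /etc/sysconfig/claymore-miner-wallet2 - remaining three GPUs
ETHER_ADDRESS=0x2222222222222222222222222222222222222222
SIA_ADDRESS=$YOURSIACOINADDRESS
EMAIL_ADDRESS=$EMAIL
GPUS=345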

Create a systemd service template

Create the file /etc/systemd/system/claymore-miner@.service and paste the configuration into it:

[Unit]
Description=Ethereum miner for %i
After=network.target


[Service]
User=nobody
EnvironmentFile=/etc/sysconfig/claymore-miner-%I
ExecStart=/bin/bash --login -c "/opt/claymore-miner/ethdcrminer64 -epool eth-eu1.nanopool.org:9999 -ewal ${ETHER_ADDRESS}/$YOUR_RIG_NAME/${EMAIL_ADDRESS} -nofee 0 -mport 0 -espw x -ftime 10 -di ${GPUS} -mode 1"
# -mode 0 -dcoin sia -dpool sia-eu1.nanopool.org:7777 -dwal ${SIA_ADDRESS}/$YOUR_RIG_NAME/${EMAIL_ADDRESS}"
## If you want to dual-mine Siacoin, remove '-mode 1"' from the ExecStart line and append the options from the commented line above (including the closing quote) to it
Restart=always

[Install]
WantedBy=multi-user.target
  • Replace $YOUR_RIG_NAME with the name of your rig; whitespace is not allowed
  • To enable dual-mining mode, append the commented -mode 0 options to the ExecStart command as described above

Enable the service

We delayed the start of each miner after booting with a simple crontab entry:

@reboot sleep 60s; systemctl start claymore-miner@$WALLETNAME1; systemctl start claymore-miner@$WALLETNAME2
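
Before relying on the crontab entry it is worth reloading systemd and starting one instance by hand to verify the template and its sysconfig file (the wallet name below is a placeholder):

sudo systemctl daemon-reload
sudo systemctl start claymore-miner@wallet1
sudo systemctl status claymore-miner@wallet1
# once the instance runs, add the @reboot line above to root's crontab
sudo crontab -e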

If you like it and want to say thank you, you can drop me some wei at 0x4c1856c9021db812f0b73785081b245f622d58ec 🙂

How to pass multiple parameters to systemd’s ExecStart – Running same ethminer instance with multiple GPUs with systemd

For our Ethereum mining rig a coworker of mine wrote a systemd template unit, so it is relatively easy to configure which graphics card in the rig is assigned to whom.

For each of the GPU owners there is a custom configuration file /etc/sysconfig/ethminer-$USERNAME (/etc/sysconfig/ethminer-ckl in my case). The file contains the following parameters:

ETHER_ADDRESS=0xYOUR_HASH
EMAIL_ADDRESS=<your email address>
# use first three GPUs, last three would be 3 4 5
OPENCL_DEVICES=0 1 2

Each user has their own configuration file and all services can be started like this:

systemctl start ethminer@ckl

But when ethminer was started via systemctl start, only the first GPU in the definition was used – GPU 0 in the configuration sample above. systemd itself called the ethminer binary with a seemingly correct command line, and the same command line worked when executed by hand. The problem was caused by how systemd passes arguments and how ethminer reads them: systemd expands ${OPENCL_DEVICES} into a single argument ("0 1 2") instead of splitting it at the whitespace, so ethminer only picked up the first device. In the end I fixed it by wrapping the ethminer command in a bash subprocess, which lets the shell do the word splitting. Our unit definition in /etc/systemd/system/ethminer@.service looked like this:

[Unit]
Description=Mine Ether for %i
After=network.target


[Service]
User=nobody
EnvironmentFile=/etc/sysconfig/ethminer-%I
# BEFORE AND NOT WORKING
# ExecStart=/usr/bin/ethminer --farm-recheck 2000 -G -S eth-eu1.nanopool.org:9999 -O ${ETHER_ADDRESS}/rig02/${EMAIL_ADDRESS} --opencl-devices ${OPENCL_DEVICES}

ExecStart=/bin/bash --login -c "/usr/bin/ethminer --farm-recheck 2000 -G -S eth-eu1.nanopool.org:9999 -O ${ETHER_ADDRESS}/rig02/${EMAIL_ADDRESS} --opencl-devices ${OPENCL_DEVICES}"

Restart=always

[Install]
WantedBy=multi-user.target

Attempting to perform the InitializeDefaultDrives operation on the ‘FileSystem’ provider failed.

Yesterday one of our PowerShell-based check_mk monitoring scripts failed because the error Attempting to perform the InitializeDefaultDrives operation on the ‘FileSystem’ provider failed. showed up. The initial reason for this is unknown, but it has to do with the New-PSDrive and Remove-PSDrive PowerShell commands we use inside the check_mk scripts.

Symptoms for the problem are:

  • The network drive is shown as disconnected in the Windows Explorer but you can still open the network drive
  • The network drive cannot be disconnected; you receive the error “Drive is not connected” or “Access Denied”
  • Get-PSDrive returns the network share
  • net use does not return the network share
  • The network drive can’t be removed with net use $YOUR_DRIVE /delete /y
  • The network drive can’t be removed with Remove-PSDrive
  • Restarting the Windows Explorer still shows the drive

To remove the network drive I tried removing all registry keys belonging to it, killing the check_mk agent, restarting the virtual machine and so on. In the end I fixed it by disabling the check_mk agent, restarting the VM and re-enabling the check_mk agent.

Update 2017-06-14: The error occurs because the check_mk agent runs under the Local System account. As a normal user or administrator you have no permission to remove the drive. A good indicator for this is the existence of the mapped network drive in the registry key HKEY_USERS\S-1-5-18\Network\$YOUR_DRIVE (S-1-5-18 is the well-known SID for the Local System account). You can delete the registry key, but the drive still exists until you reboot the Windows instance. In some circumstances the drive still shows up even then.

Microsoft suggests setting the registry value HKLM\System\CurrentControlSet\Control\Session Manager\ProtectionMode from 1 to 0. This value allows administrators to access system resources. For the sake of security, just don’t do it.

Luckily for us we can use Mark Russinovich’s psexec tool to gain Local System permissions. With this we can safely remove the drive without a reboot.

# start an interactive PowerShell session running as Local System
psexec64 -i -s powershell.exe
# inside that session, remove the stuck drive
Remove-PSDrive -Name $YOUR_DRIVE
# the drive automatically disappears from Explorer and from Get-PSDrive when executed as a non-Local System user

Cross-domain migration from Windows Server 2008 R2 to Windows Server 2012 R2

In the first weeks of our company I made the mistake of setting up the Active Directory domain with a .local suffix, which caused a lot of problems with Windows and Linux-based clients in the long run. After I had shifted my work back to my original tasks – I am still a developer – the network infrastructure descended into some kind of unplanned chaos. Too many people tried too many things for which they had neither the experience nor a sense of the implications. Besides that, we got a parallel FreeIPA domain nix.internal which had a domain trust with the Windows domain. Due to the complexity no one really pushed the domain migration, even though it had been a long-outstanding issue on our Kanban board.

Because of internal changes a few weeks ago, I took over ownership of the infrastructure. To put the “structure” back into “infrastructure”, one of my co-workers and I immediately started to plan the final migration phase.

Setting the goals

The new design of our infrastructure had been worked out over the preceding months.

  1. Instead of having the two domains domain.nix.internal and domain.local the new domains should be ad.domain.de and ipa.domain.de. ad.domain.de is managed by Active Directory, ipa.domain.de by FreeIPA. Both domains have a bi-directional trust.
  2. The servers containing the domain controller and the Microsoft Exchange server must both be migrated from Windows Server 2008 R2 to Windows Server 2012 R2.
  3. The Exchange server must be migrated from Microsoft Exchange 2010 to Exchange 2016.

Besides that we had a lot of other goals like a global naming schema for hosts, CNAME usage, consolidating our VMs and other things but this is out of scope for this article.

I prepared a dedicated JIRA project containing all the tasks we had to solve.

Setting up the new domain

Setting up the new Active Directory and FreeIPA servers was straightforward. Domain trusts were established between domain.local and ad.domain.de as well as between ad.domain.de and ipa.domain.de. I had to manually change the file permissions on our Linux-based file store as our permission concept had also changed: instead of assigning user permissions to files or directories we wanted to use security groups – best practice. This took some time but was worth the effort, as I fixed a lot of permission problems during the manual review.
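
A minimal sketch of that switch, assuming the AD security groups are resolvable on the file server (group and path names are made up):

# hand ownership to a dedicated security group instead of individual users
chgrp -R 'DOMAIN\share-projects-rw' /srv/share/projects
chmod -R g+rwX /srv/share/projects
# make new files inherit the group of their parent directory
find /srv/share/projects -type d -exec chmod g+s {} +
# optional: default ACL so the group keeps write access on newly created files
setfacl -R -d -m g:'DOMAIN\share-projects-rw':rwX /srv/share/projects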

After setting up the domain itself I prepared the (inactive) DHCP server, imported the existing settings via PowerShell from the domain.local controller, set up the DNS forwarding and so on.

Migration of Microsoft Exchange 2010 to 2016

It is one thing to migrate from Exchange 2010 to a newer version but a completely different story to do a cross-domain migration of Exchange. Google’s results for such a constellation are relatively sparse. Microsoft has not supported such a migration since Exchange 2007, and using PowerShell and your own scripts does not work either. We ended up buying CodeTwo Exchange Migration, which saved us a lot of time and pain. If you ever need to do a cross-domain migration of Exchange, purchase a license. It is worth every cent.

After the initial installation of the new Microsoft Exchange 2016 server and joining it to the new ad.domain.de domain, we set up the new Exchange server as a mail relay in our old Exchange Server 2010; both servers were listening to the same e-mail domain. This approach allowed us to test the new mail server with the existing domain. All other server settings were either exported and imported by using PowerShell or configured by hand/PowerShell.

On the migration day we disabled the POP3 collector on our old Exchange 2010, reconfigured the proxy server to point to the new Exchange 2016, did a last CodeTwo Exchange Migration run and enabled the POP3 collector on the new Exchange 2016. Apart from some hiccups with the internal/external MAPI URL of the Exchange server this went surprisingly smoothly.

Migrating clients into the new domain

The migration of the existing clients (PCs, notebooks) took much longer than expected. The migration of the local profiles did not work on all workstations. At the next logon the users were presented with the error “There are Currently No Logon Servers Available” (“Es stehen momentan keine Anmeldeserver zur Verfügung”). I figured out that this was a problem with the DNS configuration of the clients. During the join process the new DHCP server was still disabled and the clients used the old domain controller for domain.local as DNS resolver. That server had a DNS forwarder for ad.domain.de, so the clients were able to join the new domain. But during the logon process the client queried the DNS server for a SRV record and got the old domain returned, resulting in the error above. After disabling the old DHCP server, enabling the new DHCP server, manually setting the DNS server to the new domain controller and re-joining the new domain, the logon issue was gone.
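
To see which domain a client’s resolver actually answers for, it helps to query the DC locator SRV record directly (the domain names below are the placeholders used in this article):

# ask the configured DNS server which domain controllers it knows for the new domain
nslookup -type=SRV _ldap._tcp.dc._msdcs.ad.domain.de
# compare with what the old resolver returns for the old domain
nslookup -type=SRV _ldap._tcp.dc._msdcs.domain.local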

Status Quo

There is still a lot to do, e.g. our Atlassian instances are still connected to the old domain. As the security groups are completely different in both domains, I’ll have to fix this by hand. Maybe I’ll write a blog post on that, too.

Lessons learned

  1. If you don’t really need to do a cross-domain migration, don’t do it.
  2. Planning, planning, planning. Two thirds of the time I invested in this project was spent on planning.
  3. Don’t underestimate the effort. Even if you have a good plan and everything is prepared, there is so much that can go wrong. The week between Christmas and New Year was the only time we could do this without a large impact on our business.
  4. Don’t trust any profile migration tools. Reset the user’s password and log in with their credentials.
  5. If you don’t really need to do a cross-domain migration, don’t do it.

w32tm: “No valid response has been received from manually configured peer”

This was one of those network bugs which take some time to fix. To give you a short background: in the last weeks we had random NTP time drifts in our Linux infrastructure, nothing serious, but it was odd that they disappeared after restarting the ntpd or chrony service. During the weekend my co-worker and I upgraded some parts of the network infrastructure (new cables, new cabling between servers and switches, an upgrade to the latest Sophos UTM firmware, an upgrade to XenServer 7.0 and so on). After getting everything back up, our OMD showed a lot of NTP errors which led to problems with FreeIPA, Kerberos and SMB. Besides that, we had NFS problems on some of the virtual machines. My co-worker dealt with these problems. In parallel, and completely unrelated, I was preparing a new Active Directory domain and wanted to set up the w32tm service on Windows Server 2012 R2.

To my surprise I was not able to synchronize the time with our firewall. Using

w32tm /stripchart /computer:$FW_IP /samples:10 /dataonly

worked flawlessly, but when I tried

w32tm /resync /discover

I received the error

Der Computer wurde nicht synchronisiert, da keine Zeitdaten verfügbar waren.
(The computer did not resync because no time data was available.)

With w32tm debug logging enabled I got more information:

151915 13:50:46.8092075s - Logging information: NtpClient has not received response from server $FW_IP (ntp.m|0x0|0.0.0.0:123->$FW_IP:123).
151915 13:50:46.8092075s - Logging information: NtpClient: No response has been received from manual peer $FW_IP (ntp.m|0x0|0.0.0.0:123->$FW_IP:123) after 8 attempts to contact it. This peer will be discarded as a time source and NtpClient will attempt to discover a new peer from which to synchronize. Error code: 0x0000005C
151915 13:50:46.8092075s - Reachability:  removing peer $FW_IP (ntp.m|0x0|0.0.0.0:123->$FW_IP:123).  LAST PEER IN GROUP!
151915 13:50:46.8092075s - AddNewPendingPeer: manual
151915 13:50:46.8092075s - PeerPollingThread: PeerListUpdated
151915 13:50:46.8092075s - Logging error: NtpClient has been configured to acquire time from one or more time sources, however none of the sources are currently accessible and no attempt to contact a source will be made for 15 minutes. NTPCLIENT HAS NO SOURCE OF ACCURATE TIME.

At first I suspected a bug in the latest Sophos UTM firmware, so I checked the incoming packets on the firewall:

# NTP runs on UDP/123
tcpdump -vvv -i eth5 udp and port 123

Running w32tm /stripchart again, I received the expected packets. Running w32tm /resync /discover showed nothing. My guess was that the UTM was not responsible for the error. I took a look into the changelog of XenServer 7.0 to see if anything had changed with the virtual interfaces of the virtual machines. The changelog did not contain any interesting points, so it had to be an error on the lower network layers.

Do you remember that I mentioned the recabling of servers and switches? Well, some months ago one of our Netgear switches literally vaporized (by the way, I will never buy a Netgear again) and had been replaced with a Cisco SG220-50. The SG220 is a nice piece of hardware and does not only work on layer 2 like the old Netgear but also on layer 3. After checking all available options we stumbled upon the settings in Security > Denial of Service > Security Suite Settings. The settings UDP Blat and TCP Blat caught our eyes: if enabled, the Cisco drops any UDP or TCP traffic where the source port equals the destination port. Disabling both solved our NTP and NFS errors immediately. By the way, w32tm /stripchart (and, according to this, ntpq in debug mode with an unprivileged source UDP port) ran flawlessly because it does not use source port 123 but an ephemeral one.
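
On the firewall you can make the difference visible by filtering specifically for symmetric NTP traffic (interface name as in the capture above):

# w32tm /resync sends packets with source port 123 - with UDP Blat enabled
# they never show up here because the switch already drops them
tcpdump -ni eth5 'udp and src port 123 and dst port 123'
# w32tm /stripchart uses an ephemeral source port and passes through
tcpdump -ni eth5 'udp and dst port 123 and not src port 123'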

Side project: A tool for preparing and calculating quotes with Neo4j

One part of my work is to inspect requests for quotes (RfQs), gather customer requirements, estimate the effort and prepare a written offer. Most of our customers do not want to pay a separate bill for every sprint or user story but want a number or a range for the total costs of the project. A big problem is the complexity of our projects: we do not only develop software but also do user interface design and application hosting.

In the last ten years I have tried and seen a lot of methods to calculate the estimated costs of software development projects: Function Point Analysis, COCOMO, Microsoft Excel, modifiers for project management, risk and other factors, tool combinations like JIRA or Confluence. In the end every method I have tried has disadvantages in one way or another.

This being said, one year ago I started to gather my own requirements for a cost calculation tool which fits into my workflow and reduces my work:

  • I need a user interface to capture tasks (user stories, subtasks) and required material (server, licenses, hosting costs, etc.). Each of these items can be valued (e.g. effort in hours, material cost in Euro etc.).
  • Each of the items can be grouped together to form components. Each component can contain components itself.
  • Changing the cost or effort of any item should result into a recalculation of parent components and the whole offer.
  • I do not want to contaminate JIRA with estimations. JIRA is a project management tool and not a tool for estimating the cost of projects. Portfolio for JIRA does not meet these requirements either.
  • After I have structured the requirements and estimated their costs/efforts I want to export the offer in a distributable format (PDF, XML, …).
  • After the customer has agreed to order components/tasks of our offer, I want to export these items from the calculation tool into JIRA.

The ideal workflow would be:

  • The customer gets in touch with us, we receive the specifications.
  • I break down the specifications into components and derive the tasks and the required material.
  • Our team estimates the effort.
  • My superior and I define the quality levels of each task and the costs. The total costs are calculated automatically.
  • We finalize the offer and send a link to it to the customer.
  • The customer selects the components/tasks (user stories) he wants to buy.
  • The cost/pricing of selected components/tasks can be exported to the billing tool.
  • Selected components/tasks can be exported to JIRA.

Decisions

The whole project idea had matured over months before I started with a first architectural draft. One big decision was choosing the DBMS to store the data. Instead of using PostgreSQL I chose Neo4j. One of the reasons for choosing Neo4j was the requirement that components can be nested to unlimited depth. Yes, I know Common Table Expressions (CTEs), but hierarchical structures like quotes or bills of materials (BOMs) can be implemented much more easily with a graph database.

The current graph schema does not look exactly like this but it should give you an idea of the internal structure.

Schema of Request for Quote tool

As you can see, containers can be nested to unlimited depth. For every node (task, item) multiple metrics can be defined. For performance reasons a node’s parent container stores the aggregated metrics of its children. The sample data can be found in the Neo4j console.

Current state

So far I have implemented a first version of the repository and service layer in the backend. With Java’s lambda expressions I built an internal DSL to calculate different metrics which can be based upon each other. For example, I can specify that if the amount or retail price of an item changes, both are multiplied and stored in the turnover metric for materials:

forMetrics(MetricType.AMOUNT, MetricType.RETAIL_PRICE_PER_UNIT)
	.when(ifAtLeastOneChanged())
	.then(
		announce(
			multiply(MetricType.AMOUNT, MetricType.RETAIL_PRICE_PER_UNIT, MetricType.TURNOVER_MATERIAL)
		)
	);

All metrics can be calculated and aggregated up to the root node. The service layer allows the concurrent editing of parts of the whole hierarchy (moving, adding or deleting subtrees).

Goals

There is still a lot to do, but I am convinced that this project is cool and offers real value to everyone who plans complex offers. I have no idea when (and if) I will ever finish the project. I am trying to build the tool as a SaaS platform, so it should be relatively easy to make some money with it.

If you are interested in more details, drop me a line at me[at]schakko[dot]de.

Executing a CQL wildcard search in CMDBuild’s REST API

For our internal search engine I am currently developing a simple microservice to make our CMDBuild instance searchable. The microservice provides a fairly simple JSON API which itself queries the REST API of CMDBuild. Because of the insufficient documentation of CMDBuild I had to dig into the source to find out how to write a wildcard search query. CMDBuild has its own query language called CQL (CMDBuild Query Language). The CQL statements are converted into SQL which is executed natively by PostgreSQL. CQL also allows you to embed native SQL statements in CQL queries; native SQL statements are masked with (/( … )/). Between us, the combination of CQL and SQL produces absolutely messy code, but that is another story.

One problem is that the REST search API of CMDBuild is exposed through HTTP GET. Accessing the HTTP endpoint with a filter like

GET https://cmdbuild/services/rest/v2/cql?filter={CQL: "from Department where Description (/(LIKE 'Develop%')/)"}

unfortunately confuses the Apache CXF interceptor, which chokes on the percent sign. Encoding the percent sign does not help, and a POST request is not allowed.

To fix this problem I took a look into the source of CMDBuild. Luckily for me the CQL parser is automatically generated with the help of ANTLR. The grammar file is much better than any incomplete example from the official forum. There I discovered that CQL natively provides the following operators: CONTAINS, BEGIN, END, BETWEEN, NULL.
In the end it worked as I had expected:

GET https://cmdbuild/services/rest/v2/cql?filter={CQL: "from Department where Description CONTAINS 'Develop'"}
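
For completeness, this is roughly how the microservice issues the request; curl takes care of the URL encoding (the host name is a placeholder and authentication is omitted):

# let curl URL-encode the CQL filter and append it as a query parameter
curl -G 'https://cmdbuild/services/rest/v2/cql' \
  --data-urlencode 'filter={CQL: "from Department where Description CONTAINS '\''Develop'\''"}'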

Fixing “Unable to obtain lock on store lock file” when using the embedded database of Neo4j

After some years without using Neo4j I had the chance to use the graph database in my current project. Neo4j was a good fit because it makes it really easy to prototype an idea, and the project itself relies heavily on tree structures. Modelling tree structures in relational database management systems is always a PITA and takes a certain amount of time to implement in the backend. The good news is that trees are just directed graphs, which can be modelled easily with Neo4j.

As a Spring guy I used the latest spring-data-neo4j 4.1.x release to connect the application to the graph database. For easier prototyping I used the embedded driver.

import org.neo4j.ogm.session.SessionFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.neo4j.repository.config.EnableNeo4jRepositories;
import org.springframework.transaction.annotation.EnableTransactionManagement;

@EnableTransactionManagement
@EnableNeo4jRepositories(basePackages = "de.schakko.neo4j")
@Configuration
public class Neo4jConfiguration extends org.springframework.data.neo4j.config.Neo4jConfiguration {
	@Override
	public SessionFactory getSessionFactory() {
		return new SessionFactory(getConfiguration(), "de.schakko.neo4j");
	}
	
	@Bean
	public org.neo4j.ogm.config.Configuration getConfiguration() {
		org.neo4j.ogm.config.Configuration config = new org.neo4j.ogm.config.Configuration();
		
		config.driverConfiguration()
		.setDriverClassName("org.neo4j.ogm.drivers.embedded.driver.EmbeddedDriver")
		.setURI("file:///var/tmp/graph.db");
		
		return config;
	}
}

When running the integration tests (or after automatic reloading of the application by spring-boot-devtools) the application failed to reconnect to the database:

org.neo4j.kernel.StoreLockException: Unable to obtain lock on store lock file: file:///var/tmp/graph.db

In the end I fixed the error by using

import org.junit.After;
import org.junit.runner.RunWith;
import org.neo4j.ogm.drivers.embedded.driver.EmbeddedDriver;
import org.neo4j.ogm.service.Components;

class ItemRepositoryIT {
	@After
	public void tearDown() {
		// close the embedded driver so the store lock on graph.db is released between test runs
		((EmbeddedDriver) Components.driver()).close();
	}
// ...
}

in the integration tests.

Open Tabs: Resources for Web Developers, How to write perfect blog posts, Argus

Resources for Web Developers

  • Freepik is a search engine for finding free vector graphics, PSDs, icons and photos.
  • If you have to quickly create a landing page for your AppStore product, you can use the free service Landing Harbor.
  • Searching for a tool for your freelancer job? Take a look at The Freelance Stack.
  • Need a new Bootstrap-based theme? Bootstrap Zero should give you a good start.

How to write better blog posts

You have no idea how to start your next blog post? Try the free Content Idea Generator. And after you have found your topic you should definitely follow the rules of the awesome article After 10,000+ data points, we figured out how to write a perfect Medium post. The article contains a lot of hints on how to optimize your blog posts.

Monitor your time series with Argus

Argus is an open source project provided by Salesforce to monitor your time series and receive alerts. I will take a deeper look at this project in the coming weeks.

lumiverse.io

On lumiverse.io you can find some interesting videos about neural networks.

Making money with GitHub

I stumbled upon these two projects on HN: CodeMill and Bountysource both provide a marketplace for GitHub pull requests. You can search for open tickets, make a pull request and get paid by the community or the project owner.

GitHub project management

zube.io is a paid service to organize your GitHub projects. The open source alternative gh-board can be hosted on your own servers and has a lot of features like linking multiple GH repositories to one Kanban board.

Migrating InfluxDB from 0.9.6 to 0.10.0 GA

You may have already heard that InfluxDB 0.10 GA was published a few days ago. In my case the most interesting improvement is the much higher compression rate: at the moment my co-workers at NeosIT and I are collecting performance data from four internal virtual machines. After running the collectd daemon on the VMs for about a month, the size of InfluxDB’s data directory had grown by 3.5 GByte with the default retention policy.

Testing the update

After setting up a new virtual machine with InfluxDB 0.9.6 for migration testing, I updated InfluxDB to the current version (Fedora 23 in our case):

sudo wget https://s3.amazonaws.com/influxdb/influxdb-0.10.0-1.x86_64.rpm
sudo dnf install influxdb-0.10.0-1.x86_64.rpm

Two notes:

  1. As mentioned during the update procedure, the location of the configuration file moved from /etc/opt/influxdb/influxdb.conf to /etc/influxdb/influxdb.conf. The old configuration file is neither copied nor merged. You must edit the new configuration file or copy the old configuration to the new location. Keep in mind that 0.10.0 introduces some new settings.
  2. In addition to the changed configuration location, the data directory has been moved from /var/opt/influxdb/ to /var/lib/influxdb. My advice is to move the old folder to the new location and overwrite any new files (see the sketch below). Please create a backup of the /var/lib/influxdb folder before doing this. In my case this procedure worked without any problems.
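
A minimal sketch of that move on our Fedora machine (paths as mentioned above, backup location chosen arbitrarily):

sudo systemctl stop influxdb
# keep a copy of whatever the package already put into the new location
sudo cp -a /var/lib/influxdb /var/lib/influxdb.bak
# carry over the old configuration and data
sudo cp /etc/opt/influxdb/influxdb.conf /etc/influxdb/influxdb.conf
sudo rsync -a /var/opt/influxdb/ /var/lib/influxdb/
sudo chown -R influxdb:influxdb /var/lib/influxdb
sudo systemctl start influxdb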

Unfortunately it doesn’t seem to be possible to just copy a single database from one separate InfluxDB instance to another. Because of this drawback I imported some data from our virtual machines through Logstash and InfluxDB’s collectd backend. The Grafana dashboards still worked as expected, so there seem to be no breaking changes in the HTTP API, at least as far as I could see.
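
A quick smoke test against the HTTP API shows whether the databases and measurements are still there (localhost and the collectd database name are from my setup):

# list all databases
curl -G 'http://localhost:8086/query' --data-urlencode 'q=SHOW DATABASES'
# list the measurements inside the collectd database
curl -G 'http://localhost:8086/query' --data-urlencode 'db=collectd' --data-urlencode 'q=SHOW MEASUREMENTS'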

Converting data from bz1 to tsm1

Now it was time to convert the data from the old bz1 format to the new tsm1 format. InfluxDB 0.10 ships with a nifty tool named influx_tsm. Its usage is very easy and it worked as expected: just pass either the “-backup” or the “-nobackup” parameter plus the data location to be converted, and you are done:

systemctl stop influxdb
influx_tsm -backup ~/influxdb_backup -parallel /var/lib/influxdb/data/
systemctl start influxdb

Results of the conversion

In my test environment everything worked as expected, so I migrated our production environment. The influx_tsm run took around 50 minutes for 3.6 GByte of data, but the results are awesome: our data directory shrank from 3.6 GByte to 400 MByte. All Grafana dashboards work as expected, as do our collectd and Graphite InfluxDB backends. This is the output of influx_tsm:

# ...
2016/02/07 13:34:29.100186 Still Working: Completed Shards: 29/30 Points read/written: 81626859/81626859
2016/02/07 13:34:31.969792 Conversion of /var/lib/influxdb/data/collectd/default/60 successful (49m41.016488953s)

Summary statistics
========================================
Databases converted: 3
Shards converted: 30
TSM files created: 30
Points read: 81655978
Points written: 81655978
NaN filtered: 0
Inf filtered: 0
Points without fields filtered: 0
Disk usage pre-conversion (bytes): 5239898112
Disk usage post-conversion (bytes): 451150534
Reduction factor: 91%
Bytes per TSM point: 5.53
Total conversion time: 49m48.08131854s

Thanks guys, good job!