
Limiting the number of data table records on the server


jankplus Apr 15, 2015 05:31 PM

Hello,

[Apologies if this is a FAQ.]

Partners of ours are running a CRBasic program on a CR1000. They have two identical data tables, both updated hourly: one limited to the most recent 24 hours and one unlimited.
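
For concreteness, their two tables look roughly like this (a simplified sketch; the field name is made up, and the third DataTable() parameter is the number of records the CR1000 retains, with -1 meaning auto-allocate):

'Simplified sketch of the two-table setup; AirTemp is a stand-in field.
'DataTable's third parameter = records retained on the logger:
'  24 -> ring memory holding only the most recent 24 records
'  -1 -> auto-allocate remaining memory (the "unlimited" table)
Public AirTemp

DataTable (HourlyRecent,True,24)
  DataInterval (0,60,Min,10)
  Sample (1,AirTemp,FP2)
EndTable

DataTable (HourlyFull,True,-1)
  DataInterval (0,60,Min,10)
  Sample (1,AirTemp,FP2)
EndTable

BeginProg
  Scan (5,Sec,0,0)
    'measurements would go here
    CallTable HourlyRecent
    CallTable HourlyFull
  NextScan
EndProg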

They have LoggerNet connecting to this CR1000 via RF4xx, probably every five minutes, and on the server they grab all new records during every connection. The server-side data files don't "forget" the old records or limit how many records are retained the way the CR1000 data tables do.

So after 90 days of running, the CR1000's tableA has 24 records, the CR1000's tableB has 2160 records, the server's tableA file has 2160 records, and the server's tableB file has 2160 records. With me so far?

The server-side files are automatically copied to an FTP server so I can retrieve them whenever I want. The connection is over satellite internet that is very slow and tends to introduce corruption during transfer, especially for large files. So what I really want is to download small files very often, such as hourly, but still have access to the full history for download if I need it to resolve problems.

In short, I'd like the server-side files to reflect more or less exactly what the CR1000 tables hold and not "remember" the older records: a maximum of 24 (most recent) records at all times in one file, and the entire site's history at all times in the other.

Is there a clever way I could set this up, either using CRBasic data table definitions or through settings/preferences in the LoggerNet server that retrieves data from those tables? I can reprogram the CR1000 however I please, so I'm free to change data tables or introduce new ones.

Ideas? Best practices? Missing something obvious?

thanks,
Mike J+


jtrauntvein Apr 15, 2015 10:14 PM

At present, LoggerNet has no mechanism to split data into separate files based directly on time stamps. Split can be used to do this as a post-processing step, however.

LoggerNet 4.3 does have a setting, Maximum data file size (bytes), available in the Setup screen under Tools/LoggerNet Server Settings/LoggerNet Settings. The purpose of this setting is to limit the maximum size of any data file written by LoggerNet; if a file grows beyond the limit, it is backed up and a new file is started.

Another alternative is to change the table's settings so that a new data file is created each time the table is polled. This can be combined with a task that FTPs each file when it is closed. With this approach, the amount of data in a file depends entirely on how much data is collected, and therefore on both the table output interval and the LoggerNet poll schedule.

Finally, you can have the datalogger itself control the generation of data files using the TableFile() instruction in CRBasic. If you do this, LoggerNet can still collect the files using the File Retrieval tab of the Setup screen.
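
A minimal sketch of that approach follows (verify the option code and parameter order against the TableFile() topic in the CRBasic Editor help; 15 should be TOA5 with header, time stamp, and record number):

'Sketch: TableFile() inside a DataTable() declaration writes the table
'out to files as data is stored.  Assumptions to verify in the help:
'option 15 = TOA5 with header, time stamp, and record number;
'MaxFiles = 7 keeps only the seven newest files on the CRD: drive.
Public AirTemp
Public OutStat As Boolean
Public LastFileName As String * 64

DataTable (Hourly,True,-1)
  DataInterval (0,60,Min,10)
  'Close the current file and start a new one daily
  TableFile ("CRD:Hourly",15,7,0,1,Day,OutStat,LastFileName)
  Sample (1,AirTemp,FP2)
EndTable

BeginProg
  Scan (5,Sec,0,0)
    CallTable Hourly
  NextScan
EndProg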

* Last updated by: jtrauntvein on 4/15/2015 @ 5:34 PM *


jankplus May 14, 2015 06:34 PM

Thanks for the suggestions!

Split seems not to apply because I don't see how it could be automated. The Maximum data file size setting would seem to apply to all files, and I want just one file limited in size while the others grow without limit.

Using LoggerNet or the TableFile() instruction to create a new data file at regular intervals is more promising. The trouble is, I need the file to always contain the most recent 24 records, not to reset itself to zero every 24 hours and start growing again with never-before-seen records. The same data table record would need to be included in 24 different files over time before finally aging out.

For example, if it's 5:24pm right now, the file should start with yesterday's 6pm record and end with today's 5pm record. Wait an hour and the file should now start with yesterday's 7pm record and end with today's 6pm record. Those two files have 23 records in common and differ only in the first/last record of each.

I don't see a way to use the LoggerNet download settings for something like that, but TableFile() (I'd never heard of that instruction before, thanks) looks more promising. I'd have to somehow convince my CR1000 to rewrite the file with the last 24 hours every hour: something like manually deleting the file every hour and then convincing CRBasic to repopulate it completely with 24 hours of data, instead of waiting for new records to be generated one by one with CallTable.

Is there something kludgey I could do with the CR1000's status fields, or something?

Or, digging a little deeper into CRBasic, maybe I need to manually create and recreate a file hourly to my own specs using FileOpen, FileWrite, and FileClose? Is there a simple way to just write out the most recent 24 records from a DataTable? TOA5 format would be best, of course.
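
To make that concrete, here's the kind of untested sketch I'm imagining (I'm assuming GetRecord() can copy a record from N records back, that the TableName.TimeStamp(format,recsBack) dot-notation works the way I think it does, and that FileWrite() with a length of 0 writes a string up to its null terminator; this wouldn't produce a real TOA5 header):

'Untested sketch: dump the most recent 24 records of table Hourly to
'USR:last24.dat, oldest first.  Assumes a USR: drive has been sized
'(e.g., via the USRDriveSize setting).
Const NRECS = 24
Public AirTemp
Public rec(1)                    'one field per record in this sketch
Public line As String * 128
Public fh As Long
Dim i

DataTable (Hourly,True,-1)
  DataInterval (0,60,Min,10)
  Sample (1,AirTemp,FP2)
EndTable

Sub WriteLast24
  fh = FileOpen ("USR:last24.dat","w",0)
  For i = NRECS To 1 Step -1     'oldest of the 24 first
    GetRecord (rec(),Hourly,i)   'copy the record from i records back
    line = Hourly.TimeStamp(1,i) & "," & FormatFloat (rec(1),"%g") & CHR(13) & CHR(10)
    FileWrite (fh,line,0)
  Next i
  FileClose (fh)
EndSub

BeginProg
  Scan (5,Sec,0,0)
    CallTable Hourly
    If TimeIntoInterval (5,60,Min) Then Call WriteLast24
  NextScan
EndProg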

Other ideas? Am I barking up an impossible tree?

Mike J+


jra May 14, 2015 07:09 PM

I think LoggerNet can do what you need.
In Setup, select your datalogger and go to the Data tab. Select your table and choose "Create Unique File Name" for the File Output Option; every time you collect, a new file will be created. In the Collect Mode section, choose "Most Recently Logged Records" and enter 24 in the Records to Collect box. Apply the settings.


pokeeffe May 14, 2015 07:12 PM

Have you tried changing these scheduled-collection options for the table in question?

* File output option from 'Append to End of File' to 'Overwrite Existing File'
* Collect mode from 'Data Logged Since Last Collection' to 'Collect at most: 24 records'

The obvious drawback is higher bandwidth, which might be a deal-breaker for satellite comms.

EDIT: 'Most recently logged' collect mode is probably the better choice.

* Last updated by: pokeeffe on 5/14/2015 @ 1:14 PM *


jankplus May 14, 2015 07:44 PM

Two potential issues with that approach...

1. I need one file to contain all of the data table's records back to the time of the logger's initial start-up, and a second file to contain only that table's most recent 24 records. I could do this by duplicating my data tables with slightly different names (hourlyX and hourlyY, then set up LoggerNet to grow the hourlyX file without bounds but keep only 24 records from hourlyY), but that seems wasteful and potentially imprecise, and it would grow my CR1 program by quite a lot. That is, of course, the exact approach I described in my first message. When I looked at TableFile(), though, I got excited about potentially eliminating the data table redundancy: write the usual (complete) table to CRD but create a differently named file with only 24 records from that same table. If I could do that with one line of code at the top of one data table, that would be awesome (I've sketched what I mean just after point 2 below).

2. 'Create Unique File Name' will, of course, create a unique file name, which might confound an automated routine that accesses the most recent file by FTP. It would also have to automatically delete the older files rather than leave them all lying around; I don't see an option to limit (cycle through?) the number of such files, but maybe there is one?
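
Here's the kind of one-table sketch I meant in point 1 (untested; the option code and parameter order would need checking against the TableFile() help, and as far as I can tell this closes a new file after every 24 records rather than maintaining a sliding last-24 window, so it still isn't quite the moving 24-hour file I described above):

'Untested: one complete data table, with TableFile() also spinning off
'small 24-record TOA5 files on the USR: drive, keeping only the two
'newest (MaxFiles = 2).  Option 15 assumed = TOA5 with header.
Public AirTemp
Public OutStat24 As Boolean
Public LastFile24 As String * 64

DataTable (Hourly,True,-1)
  DataInterval (0,60,Min,10)
  TableFile ("USR:Hourly24_",15,2,24,0,Hr,OutStat24,LastFile24)
  Sample (1,AirTemp,FP2)
EndTable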

It might help if I take a step back and describe the problem I'm trying to solve. This program will run on a buoy in the Caribbean that is accessed regularly via RF401 by a local computer running LoggerNet. That computer (or possibly another computer where the files are auto-copied; I'm not sure) is accessible over a very slow satellite-based FTP connection that sometimes corrupts large downloads.

Downloading the full hourly data file is not a problem for the first few months. After that the file grows too large, downloads take too long, and corruption becomes too frequent, when all I want is to connect hourly and get the most recent (single) record. But collecting only one record isn't ideal either, because server reboots and network outages mean there will often be periods of several hours when that download just doesn't happen.

So in the best case I'd FTP to the server once per hour, download the last 24 hours of data, compare that to what I've got, add in anything I've never seen before, and go to town. In the worst case I could, at any time, manually FTP the entire archive of data, compare it to what I've collected from the hourly downloads, and patch any holes.

My ideal scenario would be a simple script on the remote computer that, every hour, just extracts the most recent records from the *.dat file and writes them to a different filename (think Linux's 'tail' command). But that server will most likely (1) be running Windows and (2) be outside of my control, so scripting isn't going to be a feasible approach.

So yeah, I'm not sure whether LoggerNet or CRBasic alone can do this rather specific thing, but I figured there was no harm in asking. I do appreciate the suggestions so far from both Janet and jtrauntvein, and I'm glad to have made the acquaintance of TableFile().

thanks,
Mike J+


pokeeffe May 14, 2015 08:00 PM

Unique file name is a good choice, but:

* the file size/#recs will be determined by data collection frequency
* it doesn't do log rotation, so you'd need to clean up old files manually
* (obviously) the file name you'd want to retrieve is a moving target

Since the LoggerNet instance is linked by radio, not satellite, it's simplest to use 'Overwrite Existing' & 'Get Last 24 Records' and collect hourly. If the small table only contains 24 records, then you can safely use 'Overwrite Existing' & 'Get All Data'.


Dana May 19, 2015 09:05 PM

I think the suggestion to use Split is a good solution.

In the Split PAR file you can set a start condition and stop condition based on "PC time" (i.e., go back 24 hours from right now). It can also be automated: create the PAR file to process the data, then use LoggerNet's Task Master to run it. The task can be triggered by time, by data collection from a datalogger, or by one of several other events.

The LoggerNet help contains information on running Split as a task in the Task Master (open the Task Master, go to the Help menu, and scroll down to Example #1). The Split help has information on processing files based on PC time (open Split, place your cursor in the Start condition line, and press F1).

Dana W.


pokeeffe May 20, 2015 12:32 AM

Split isn't necessary if the existing LoggerNet server settings can be modified.

Given:

* "two identical data tables, both updated hourly, one limited to the most recent 24 hours and one unlimited"
* "On the server they grab all new records during every connection."

The goal: "the server-side files to reflect more or less exactly what the CR1000 tables hold, and not "remember" the older records."

Solution: just change tableA's file output option to overwrite existing and grab the last 24 records.


jankplus May 20, 2015 07:29 PM

After some more testing: the 'overwrite existing' approach can give me the files I want, but unfortunately at an unacceptable cost.

I simplified some aspects of my logger program in order to clarify exactly what my question was. In the non-simplified version, the program has six data tables with intervals ranging from 1 day to 5 seconds. The hourly table is the most important to our day-to-day processing, but we want access to all tables, preferably with the option to download either only the last FOO records or all records since go-live, and preferably with frequent updates (LoggerNet's default of every five minutes is good).

The 'overwrite existing' approach requires me to duplicate all of these tables so that some can be downloaded in their entirety and some with only the most recent records. That is nearly a dealbreaker right there because of the CR1 programming clutter it necessitates. I also worry that others may someday alter these data table definitions without realizing they are supposed to be clones of one another. I do my best with comments, but comments can be ignored.

Still, I gave it a shot. Each of the five tables was set up, in its 'recent only' version, to download anywhere from 24 to 720 records on every download, regardless of whether they'd already been downloaded. Unfortunately, re-downloading all these data took a bit more than 4 minutes by RF401, so on an every-five-minute schedule the chances were excellent that someone trying to access those data would get a partial file. By contrast, a more traditional setup that downloads only the newly produced records takes less than 15 seconds.

[One factor here is that LoggerNet's collection scheduling is configured per station, not per data table. Otherwise I might update the hourly table only once per hour (while updating the 5-second table more frequently) and then time the FTP accesses NOT to coincide with those hourly updates, or something.]

As a kludge, I've been able to set up a Cygwin crontab (using cygrunsrv) that checks whether the LoggerNet files are currently being written, waits if necessary, then does some hacking with 'head' and 'tail' to give me the files I want alongside the originals. I'm not sure how well that will work in practice, but so far it seems much better than the Windows Task Scheduler kludge I attempted once in the past.

I did not know, however, that LoggerNet's Split could be automated, so I may experiment along those lines as well. The advantage is that I know LoggerNet is running on the field computer, whereas I'm not 100% certain I'll be allowed to put Cygwin on it or have admin access to Windows services.

Thanks again to all for the many ideas raised in this discussion.


Dana May 21, 2015 08:18 PM

Regarding making sure there are no conflicts with a file that is still open: keep in mind that LoggerNet's Task Master has an "After File Closed" option, so a task triggered by data collection from a datalogger waits until the data file being written is closed before running.

Task Master can run executables, batch files, CoraScript, etc., and there is also a built-in FTP option.

Dana W.


pokeeffe May 22, 2015 04:31 AM

That is a bit different from how I understood it. Duplicating tables is not a good idea, and you're evidently limited by the radio-link bandwidth too.

Definitely look at Task Master for scheduling. It has some idiosyncrasies, but it's reliable, has a good selection of triggers, and can start anything via a batch file.
