# Problem opening files



## kr0m (Oct 20, 2020)

Hello,
I am experiencing problems opening files from all kinds of apps. I have tested several text editors (geany, code-oss, atom and codelite), and all of them crash when I try to open a file from within them. In geany, however, I can open recent files without crashing, and once I have opened a recent file successfully geany doesn't crash anymore until I close all the files and try to open another one.
I don't know if the other editors behave the same way because I don't have any recent files in them.
On the other hand, if I try to open recent files from transmission-gtk it crashes. I have taken a look at the core files using GDB, and I only get useful information from the geany and transmission core files:

`gdb transmission-gtk transmission-gtk.core`


```
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000008017f46df in flockfile () from /lib/libc.so.7
[Current thread is 1 (LWP 100941)]
(gdb) backtrace
#0  0x00000008017f46df in flockfile () at /lib/libc.so.7
#1  0x00000008017d7cbb in fgets () at /lib/libc.so.7
#2  0x00000008016f58c4 in  () at /lib/libc.so.7
#3  0x00000008016f540c in getfsent () at /lib/libc.so.7
#4  0x0000000800f2ed40 in g_unix_mount_points_get () at /usr/local/lib/libgio-2.0.so.0
#5  0x0000000800f2eda9 in g_unix_mount_point_at () at /usr/local/lib/libgio-2.0.so.0
#6  0x0000000800f8d7a1 in  () at /usr/local/lib/libgio-2.0.so.0
#7  0x0000000800f8d5b4 in  () at /usr/local/lib/libgio-2.0.so.0
#8  0x0000000800f911c5 in  () at /usr/local/lib/libgio-2.0.so.0
#9  0x0000000800f8eace in  () at /usr/local/lib/libgio-2.0.so.0
#10 0x0000000800ec1a79 in  () at /usr/local/lib/libgio-2.0.so.0
#11 0x0000000800f0cc45 in  () at /usr/local/lib/libgio-2.0.so.0
#12 0x0000000801116463 in  () at /usr/local/lib/libglib-2.0.so.0
#13 0x0000000801115252 in  () at /usr/local/lib/libglib-2.0.so.0
#14 0x0000000800d9d736 in  () at /lib/libthr.so.3
#15 0x0000000000000000 in  ()
```

`gdb geany geany.core`


```
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000008016437a0 in strcmp () from /lib/libc.so.7
[Current thread is 1 (LWP 101721)]
(gdb) backtrace
#0  0x00000008016437a0 in strcmp () at /lib/libc.so.7
#1  0x0000000801166c50 in g_unix_mount_points_get () at /usr/local/lib/libgio-2.0.so.0
#2  0x0000000801166da9 in g_unix_mount_point_at () at /usr/local/lib/libgio-2.0.so.0
#3  0x00000008011c57a1 in  () at /usr/local/lib/libgio-2.0.so.0
#4  0x00000008011c55b4 in  () at /usr/local/lib/libgio-2.0.so.0
#5  0x00000008011c91c5 in  () at /usr/local/lib/libgio-2.0.so.0
#6  0x00000008011c6ace in  () at /usr/local/lib/libgio-2.0.so.0
#7  0x00000008010f9a79 in  () at /usr/local/lib/libgio-2.0.so.0
#8  0x0000000801144c45 in  () at /usr/local/lib/libgio-2.0.so.0
#9  0x0000000801347463 in  () at /usr/local/lib/libglib-2.0.so.0
#10 0x0000000801346252 in  () at /usr/local/lib/libglib-2.0.so.0
#11 0x00000008005c6736 in  () at /lib/libthr.so.3
#12 0x0000000000000000 in  ()
```

When loading the other core files it seems that symbols are missing, I don't know. Opening the code-oss core I only see a backtrace of this sort:


```
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000000080bd83784 in ?? ()
[Current thread is 1 (LWP 101247)]
(gdb) backtrace
#0  0x000000080bd83784 in ?? ()
#1  0x0000000807af6df1 in ?? ()
#2  0x0000000812436840 in ?? ()
#3  0x0000000000000001 in ?? ()
#4  0x000000080bfbf0e0 in ?? ()
#5  0x000000080bfbf0e0 in ?? ()
#6  0x0000000000000000 in ?? ()
```

My FreeBSD version is 12.1.
`uname -a`
FreeBSD bagheera.alfaexploit.com 12.1-RELEASE-p10 FreeBSD 12.1-RELEASE-p10 GENERIC  amd64

I don't know whether an update broke it. Any ideas?

Best regards.


----------



## SirDice (Oct 22, 2020)

Check your /var/log/messages, looking specifically for disk issues. If there are none you may still have some filesystem corruption; boot to single user mode and run fsck(8) (assuming you have UFS), and let it fix any errors it finds.


----------



## kr0m (Oct 22, 2020)

When the editor crashes I can see this in /var/log/messages:

```
Oct 22 17:05:25 bagheera kernel: pid 4831 (geany), jid 0, uid 1000: exited on signal 11 (core dumped)
```

If I open the editor from the terminal I can see some strange errors about the fstab file:

```
fstab: /etc/fstab:fstab: 6: fstab: /etc/fstab:6: Inappropriate file type or format
Inappropriate file type or format
fstab: fstab: /etc/fstab/etc/fstab::5: 5: Inappropriate file type or formatInappropriate file type or format
fstab: fstab: /etc/fstab/etc/fstab::11: 11: Inappropriate file type or format
Inappropriate file type or format
fstab: /etc/fstab:19: Inappropriate file type or format
/etc/fstab:19: Inappropriate file type or format
Segmentation fault (core dumped)
```

But if I repeat exactly the same operation, the fstab error lines change:

```
fstab: /etc/fstab:fstab: /etc/fstab:6: 6: Inappropriate file type or format
Inappropriate file type or format
fstab: /etc/fstab:15: Inappropriate file type or format
fstab: /etc/fstab:19: Inappropriate file type or format
Segmentation fault (core dumped)
```

By the way, I am using ZFS.


----------



## SirDice (Oct 22, 2020)

It all seems to point to random read errors (which then cause your editor or other applications to crash while trying to read). Your disk may be dying. You may want to check it with sysutils/smartmontools: `smartctl -a /dev/<disk_device>`


----------



## kr0m (Oct 22, 2020)

I will try it tonight, but the other OS apps seem to work without crashing. If the disk were failing, the other apps should crash too, shouldn't they?


----------



## SirDice (Oct 22, 2020)

Can't say, they could appear randomly. And some applications might be able to deal with errors better than others.


----------



## olli@ (Oct 22, 2020)

Can you open the files with the command line tools from the FreeBSD base system, e.g. /usr/bin/vi or /usr/bin/ee? If this works, there might be a problem with your packages. Did you install pre-built packages with `pkg`, or did you build them yourself? If the latter, then there might be a library conflict that may cause crashes like that. Well, it’s unlikely, but it’s still possible.

Other than that, I agree that it sounds like a disk issue. Usually, when there are random crashes, the first thing that comes to mind is bad RAM, overclocked CPU, maybe PSU issues and things like that. But in that case there would be random crashes everywhere (especially under load), not just when opening certain files.

Look at dmesg(8) and /var/log/messages for any errors reported by the disk drivers. You might also want to install sysutils/smartmontools and run `smartctl -a /dev/ada0` (or whatever the name of your disk device is). For a start, look at the lines “Reallocated_Sector_Ct” and “Current_Pending_Sector”; both should be 0 (raw value column). You may also want to initiate a self test; see the smartctl(8) manual page for details.
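As a rough sketch of that check, the attribute table can be filtered so that only the listed counters with a non-zero raw value are printed. The function name `smart_nonzero` is made up for illustration, and attribute names vary by vendor, so adjust the list to whatever `smartctl -A` prints for your drive.

```shell
#!/bin/sh
# smart_nonzero NAME...  (hypothetical helper, not part of smartmontools)
# Reads `smartctl -A` style output on stdin and prints "NAME RAW_VALUE"
# for each requested attribute whose raw value is not 0.
smart_nonzero() {
    awk -v names="$*" '
        BEGIN {
            n = split(names, want, " ")
            for (i = 1; i <= n; i++) keep[want[i]] = 1
        }
        # Attribute rows start with a numeric ID; column 2 is the
        # attribute name and column 10 is the raw value.
        $1 ~ /^[0-9]+$/ && ($2 in keep) && $10 != "0" { print $2, $10 }
    '
}

# Example (device name is an assumption):
#   smartctl -A /dev/ada0 | smart_nonzero Reallocated_Sector_Ct Current_Pending_Sector
```

No output means that none of the named attributes were found with a non-zero raw value; it does not by itself prove the disk is healthy.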

First and foremost, it’s probably a good idea to make a backup of your valuable data, if still possible.


----------



## kr0m (Oct 22, 2020)

Hello, I have checked the disk with smartmontools but I am not sure if it is healthy:


```
smartctl -a /dev/ada3
smartctl 7.1 2019-12-30 r5022 [FreeBSD 12.1-RELEASE-p10 amd64] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Phison Driven SSDs
Device Model:     KINGSTON SA400S37120G
Serial Number:    50026B767601A331
LU WWN Device Id: 5 000000 000000000
Firmware Version: SBFK71E0
User Capacity:    120,034,123,776 bytes [120 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-4 (minor revision not indicated)
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Thu Oct 22 21:28:12 2020 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)    Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 248)    Self-test routine in progress...
                    80% of test remaining.
Total time to complete Offline
data collection:         (65535) seconds.
Offline data collection
capabilities:              (0x79) SMART execute Offline immediate.
                    No Auto Offline data collection support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   2) minutes.
Extended self-test routine
recommended polling time:      (  30) minutes.
Conveyance self-test routine
recommended polling time:      (   6) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000a   100   100   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       12758
 12 Power_Cycle_Count       0x0012   100   100   000    Old_age   Always       -       908
148 Unknown_Attribute       0x0000   255   255   000    Old_age   Offline      -       0
149 Unknown_Attribute       0x0000   255   255   000    Old_age   Offline      -       0
167 Write_Protect_Mode      0x0022   100   100   000    Old_age   Always       -       0
168 SATA_Phy_Error_Count    0x0012   100   100   000    Old_age   Always       -       0
169 Bad_Block_Rate          0x0000   100   100   000    Old_age   Offline      -       6
170 Bad_Blk_Ct_Erl/Lat      0x0013   100   100   010    Pre-fail  Always       -       0/7
172 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0
173 MaxAvgErase_Ct          0x0000   100   100   000    Old_age   Offline      -       175 (Average 128)
181 Program_Fail_Count      0x0012   100   100   000    Old_age   Always       -       0
182 Erase_Fail_Count        0x0000   255   255   000    Old_age   Offline      -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
192 Unsafe_Shutdown_Count   0x0012   100   100   000    Old_age   Always       -       95
194 Temperature_Celsius     0x0023   064   052   000    Pre-fail  Always       -       36 (Min/Max 20/48)
196 Reallocated_Event_Count 0x0000   100   100   000    Old_age   Offline      -       0
199 SATA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       0
218 CRC_Error_Count         0x0000   100   100   000    Old_age   Offline      -       0
231 SSD_Life_Left           0x0013   100   100   000    Pre-fail  Always       -       87
233 Flash_Writes_GiB        0x0013   100   100   000    Pre-fail  Always       -       13216
241 Lifetime_Writes_GiB     0x0012   100   100   000    Old_age   Always       -       9765
242 Lifetime_Reads_GiB      0x0012   100   100   000    Old_age   Always       -       3214
244 Average_Erase_Count     0x0000   100   100   000    Old_age   Offline      -       128
245 Max_Erase_Count         0x0000   100   100   000    Old_age   Offline      -       175
246 Total_Erase_Count       0x0000   100   100   000    Old_age   Offline      -       758124

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     12757         -
# 2  Short offline       Completed without error       00%     12756         -
# 3  Short offline       Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
```

I can read `No Errors Logged` in the output.

Olli, I can open files without problems using non-graphical editors like vi. I think the bug is linked to partition mount points: all the core files of the crashed apps seem to have a g_unix_mount_points_get call in common. I suspect the OS is buggy in that area and the apps crash because of it. I am only guessing; I am not an expert in debugging or in FreeBSD.

I have checked the dmesg output and the /var/log/messages log file, but the only strange entry that I have found is in the messages file:

```
Oct 22 08:06:15 bagheera console-kit-daemon[6449]: WARNING: Error waiting for native console 1 activation: Inappropriate ioctl for device
```

But I think it is not related to my problem.

Thanks for your advice.


----------



## ralphbsz (Oct 23, 2020)

The SSD looks reasonable. Total writes of 13TB on a 120GB disk means a little over 100 cycles per bit on average (it says the max erase count is 175), so probably not near wearout limit.
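That estimate can be reproduced with shell integer arithmetic. This is only a sketch: it assumes the raw value of `Flash_Writes_GiB` really is in GiB, as the name suggests, and it uses the user capacity in bytes from the smartctl output. The function name `wear_cycles` is just for illustration.

```shell
#!/bin/sh
# wear_cycles WRITTEN_GIB CAPACITY_BYTES  (hypothetical helper)
# Average number of full-drive writes: data written divided by capacity.
# Integer arithmetic only, so the result is rounded down.
wear_cycles() {
    written_gib=$1       # e.g. raw value of Flash_Writes_GiB
    capacity_bytes=$2    # e.g. "User Capacity" in bytes
    echo $(( $written_gib * 1024 * 1024 * 1024 / $capacity_bytes ))
}

wear_cycles 13216 120034123776    # prints 118
```

The result is in the same range as the reported average erase count of 128.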

I would like to offer a different theory: Something about your install is corrupted, and some library that is needed by console-kit and GUI based programs is messed up. Why do I say that? Because as far as I can see, all the problems come from GUI programs and from console-kit. Can you try booting single-user, and editing with vi (which is in base)?


----------



## wolffnx (Oct 23, 2020)

kr0m said:


> Hello, I have checked the disk with smartmontools but I am not sure if it is healthy:
>
> ...



I think you have to run at least one test:


`smartctl -t long /dev/ada0`

This takes some time; you can view the progress with

`smartctl -a /dev/ada0`


----------



## SirDice (Oct 23, 2020)

I'm a little suspicious about these: 


kr0m said:


> ```
> 192 Unsafe_Shutdown_Count 0x0012 100 100 000 Old_age Always - 95
> ```



Do you properly shut down the machine? Maybe you had some power failures? While ZFS is quite resilient when it comes to filesystem corruption, it's not infallible. Having the machine just power off without a clean shutdown can, over time, cause filesystem corruption. The more often this happens, the higher the risk of corruption.


----------



## kr0m (Oct 23, 2020)

I am running the disk test:

```
Test will complete after Fri Oct 23 11:54:30 2020 CEST
```

When it finishes I will run the `smartctl -a` command.

On the other hand, I think that I am shutting down the machine the correct way. I am executing:

```
shutdown -p now
```

Is that the correct way?

wolffnx, when the disk test ends I will reboot in single user mode to open a file using vi, but vi always works, even in graphical mode.


----------



## kr0m (Oct 23, 2020)

The disk test has finished; this is the result:


```
smartctl -a /dev/ada3

smartctl 7.1 2019-12-30 r5022 [FreeBSD 12.1-RELEASE-p10 amd64] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Phison Driven SSDs
Device Model:     KINGSTON SA400S37120G
Serial Number:    50026B767601A331
LU WWN Device Id: 5 000000 000000000
Firmware Version: SBFK71E0
User Capacity:    120,034,123,776 bytes [120 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-4 (minor revision not indicated)
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Fri Oct 23 11:56:52 2020 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)    Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever
                    been run.
Total time to complete Offline
data collection:         (65535) seconds.
Offline data collection
capabilities:              (0x79) SMART execute Offline immediate.
                    No Auto Offline data collection support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   2) minutes.
Extended self-test routine
recommended polling time:      (  30) minutes.
Conveyance self-test routine
recommended polling time:      (   6) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000a   100   100   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       12764
 12 Power_Cycle_Count       0x0012   100   100   000    Old_age   Always       -       909
148 Unknown_Attribute       0x0000   255   255   000    Old_age   Offline      -       0
149 Unknown_Attribute       0x0000   255   255   000    Old_age   Offline      -       0
167 Write_Protect_Mode      0x0022   100   100   000    Old_age   Always       -       0
168 SATA_Phy_Error_Count    0x0012   100   100   000    Old_age   Always       -       0
169 Bad_Block_Rate          0x0000   100   100   000    Old_age   Offline      -       6
170 Bad_Blk_Ct_Erl/Lat      0x0013   100   100   010    Pre-fail  Always       -       0/7
172 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0
173 MaxAvgErase_Ct          0x0000   100   100   000    Old_age   Offline      -       175 (Average 128)
181 Program_Fail_Count      0x0012   100   100   000    Old_age   Always       -       0
182 Erase_Fail_Count        0x0000   255   255   000    Old_age   Offline      -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
192 Unsafe_Shutdown_Count   0x0012   100   100   000    Old_age   Always       -       96
194 Temperature_Celsius     0x0023   063   052   000    Pre-fail  Always       -       37 (Min/Max 20/48)
196 Reallocated_Event_Count 0x0000   100   100   000    Old_age   Offline      -       0
199 SATA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       0
218 CRC_Error_Count         0x0000   100   100   000    Old_age   Offline      -       0
231 SSD_Life_Left           0x0013   100   100   000    Pre-fail  Always       -       87
233 Flash_Writes_GiB        0x0013   100   100   000    Pre-fail  Always       -       13234
241 Lifetime_Writes_GiB     0x0012   100   100   000    Old_age   Always       -       9776
242 Lifetime_Reads_GiB      0x0012   100   100   000    Old_age   Always       -       3215
244 Average_Erase_Count     0x0000   100   100   000    Old_age   Offline      -       128
245 Max_Erase_Count         0x0000   100   100   000    Old_age   Offline      -       175
246 Total_Erase_Count       0x0000   100   100   000    Old_age   Offline      -       759108

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     12764         -
# 2  Short offline       Completed without error       00%     12758         -
# 3  Extended offline    Completed without error       00%     12757         -
# 4  Short offline       Completed without error       00%     12756         -

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
```


----------



## SirDice (Oct 23, 2020)

kr0m said:


> Is that the correct way?


That's excellent actually. Is this a laptop perhaps? Then I can imagine leaving it on, forgetting about it, and when the battery is drained it will shut off fairly suddenly. That value may not mean anything though. Just something to keep in mind, sudden loss of power could certainly cause filesystem corruptions, even with ZFS. Because this is ZFS you may want to run a scrub, just to see if things are still in order.


----------



## kr0m (Oct 23, 2020)

SirDice, that's not a laptop, it's a desktop system.
wolffnx, I have rebooted and booted with the `boot -s` command, selected my shell and opened the /etc/hosts file using vi, but it shows a read-only filesystem error.


----------



## SirDice (Oct 23, 2020)

kr0m said:


> I have rebooted and booted with the `boot -s` command,


When you boot in single user mode only the root filesystem is mounted, and is mounted read-only.

For ZFS: `zfs set readonly=off zroot/ROOT/default` and perhaps a `zfs mount -a` if you need any other filesystems.


----------



## wolffnx (Oct 23, 2020)

kr0m said:


> SirDice, that's not a laptop, it's a desktop system.
> wolffnx, I have rebooted and booted with the `boot -s` command, selected my shell and opened the /etc/hosts file using vi, but it shows a read-only filesystem error.



As SirDice says, or you can boot in normal mode and work from the console with vi or nano.


----------



## kr0m (Oct 23, 2020)

ZFS seems to be in order. I have executed:
`zpool scrub zroot`

When the scrub finished I got this output from the status command:

```
zpool status
  pool: zroot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:11:57 with 0 errors on Fri Oct 23 12:26:22 2020
config:

    NAME        STATE     READ WRITE CKSUM
    zroot       ONLINE       0     0     0
      ada3p3    ONLINE       0     0     0

errors: No known data errors
```

When I can, I will try single user mode with the `zfs set readonly=off zroot/ROOT/default` command.


----------



## kr0m (Oct 23, 2020)

I have booted in single user mode. With the `zfs set readonly=off zroot/ROOT/default` command I can edit files, or at least vi doesn't complain about it, but I get a strange editor mode:
I have a Spanish keyboard and vi is using the US keyboard layout, so I can't exit vi. I am jailed inside the vi interface, hahaha. I rebooted using Ctrl+Alt+Del.
I don't know if that is enough proof to rule out disk problems.


----------



## ralphbsz (Oct 23, 2020)

kr0m said:


> I don't know if that is enough proof to rule out disk problems.


Proof? Perhaps not in the mathematical sense. But it makes it very unlikely that you have a pervasive disk problem. And that means that it is a better investment of time to look elsewhere.

Here's my suggestion: Build the system back up in layers. Single-user mode worked. Great. Boot into console mode, without a GUI, without X. Use that to verify that you can read and write (edit) files, both in a system directory like /etc or /usr/local/etc, and in a home directory. If that works well, then try the same thing with the GUI running, but still with a simple editor (like vi, nano, ee) in a shell window. If that still works, then start using GUI tools.
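The read/write check at each layer can be scripted so it is repeated the same way every time. This is a minimal sketch, not a thorough filesystem test: `rw_check` and its scratch file name are invented for illustration, and it only confirms that a small file can be written, read back and removed in the given directory.

```shell
#!/bin/sh
# rw_check DIR  (hypothetical helper)
# Writes a small scratch file in DIR, reads it back, removes it, and
# returns 0 only if the content read matches what was written.
rw_check() {
    dir=$1
    probe="$dir/.rw_check.$$"      # scratch file name is an assumption
    expected="rw-check $$"
    printf '%s\n' "$expected" > "$probe" || return 1
    actual=$(cat "$probe" 2>/dev/null)
    rm -f "$probe"
    [ "$actual" = "$expected" ]
}

# Example: run as root for system directories, as the user for $HOME.
#   for d in /etc /usr/local/etc "$HOME"; do
#       rw_check "$d" && echo "$d: ok" || echo "$d: FAILED"
#   done
```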

And you're going to have to figure out the keyboard problem on the console, sooner or later. Being trapped inside vi and having to reboot is sort of a trap door in the floor. One of these days something WILL go wrong, and you'll have to use single-user mode to fix it, and then not having a functioning keyboard will be deadly.

Anecdote: My FreeBSD server sits on a shelf in the basement, with a small monitor that's wall-mounted nearby (for emergency repairs only, for real work I log in over the network), and a physically tiny keyboard that takes up no storage space (the traditional IBM/Lenovo ThinkPad keyboard, in an external USB packaging). Great. A few months ago, we cleaned up old unused keyboards (we had about 20 of them), threw the boring ones in the recycler, and gave the good ones (mechanical switches, old metal bases) away to collectors. So at this point, we had only 5 keyboards left at home, all actually in use, and 3 of them are very big and used for desktops. And then one day the FreeBSD server needed maintenance, and IT HAD NO KEYBOARD. It was just missing, nowhere to be found. I tried the only other small keyboard (which is usually on a Raspberry Pi), and it kind of works, but because it has a Mac layout, the control key is in the wrong place. My wife suspected that the Lenovo laptop keyboard might have been thrown away in the great cleanup. So for several weeks, whenever I needed to actually log into the console, I had to first run upstairs and retrieve a great big mechanical keyboard and plug it in, and then put it back for doing my day job. And this always happens at the least convenient time, like when we have a power outage and need to shut down quickly. About two weeks ago, we finally found the small Lenovo keyboard: It was in a stack of documents on the shelf. It is so small and flat that it actually vanishes if you put a few cm of paper documents on top of it.


----------



## _martin (Oct 23, 2020)

As you said, you can open files with standard editors. I'm guessing you have no problems using other programs that also open files, such as cat or less.
The g_unix_mount_points_get() is suspicious, indeed.

Though personally I like devel/strace more, the native truss(1) does the job too. It is worth tracing the crash with it (starting the text editor under truss).

From the little output I saw, I have a feeling that the function passes a NULL pointer (to strcmp() and flockfile()).
You could share the output of the `info registers rsi rdi rdx` command from gdb to verify that.

The truss output could give a hint as to why that's happening.
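A truss log gets long quickly, and usually only the system calls just before the crash matter. As a sketch, the helper below prints the lines leading up to the first `SIGNAL` line of a saved log. The name `before_signal` and the log file name are assumptions; writing the trace to a file with `truss -o` also keeps it from being interleaved with the program's own stderr.

```shell
#!/bin/sh
# before_signal LOG [N]  (hypothetical helper)
# Prints the N lines (default 20) that precede the first "SIGNAL <num>"
# line in a truss log, followed by the SIGNAL line itself.
before_signal() {
    log=$1
    n=${2:-20}
    awk -v n="$n" '
        # Ring buffer holding the most recent n+1 lines.
        { buf[NR % (n + 1)] = $0 }
        /SIGNAL [0-9]+/ {
            start = NR - n
            if (start < 1) start = 1
            for (i = start; i <= NR; i++) print buf[i % (n + 1)]
            exit
        }
    ' "$log"
}

# Example (file name is an assumption):
#   truss -o geany.truss geany
#   before_signal geany.truss 30
```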


----------



## kr0m (Oct 23, 2020)

Yes ralphbsz, I have to solve the keyboard layout problem in single user mode.

The behavior is a little bit strange. As I said in my first post, when I open recently opened files with geany I can work with it without problems; geany only crashes when I open a file without having opened a recent one first. With the other graphical editors I can't run the same test because I have been using geany since the first day and I don't have any recent files in the other editors.

cat, less and vi work 100% of the time.

Here is the gdb register info for the geany crash:

```
Core was generated by `geany'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000008016556df in flockfile () from /lib/libc.so.7
[Current thread is 1 (LWP 101823)]
(gdb) info registers rsi rdi rdx
rsi            0x400               1024
rdi            0x0                 0
rdx            0x0                 0
```

Launching geany with `truss geany`, opening a file and letting it crash, I get the following output (only the last lines attached):

```
access("/home/kr0m",W_OK)             = 0 (0x0)
:fstatat(AT_FDCWD,"/home/kr0m",{ mode=drwxr-xr-x ,inode=8,size=209,blksize=16384 },0x0) = 0 (0x0)
fstatat(AT_FDCWD,"/mnt/6T/.Trash",0x7fffdebf3c20,AT_SYMLINK_NOFOLLOW) ERR#2 'No such file or directory'
write(2,":",1)                     = 1 (0x1)
issetugid()                     = 0 (0x0)
mmap(0x0,4096,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34484510720 (0x8076fe000)
geteuid()                     = 1000 (0x3e8)
19: fstatat(AT_FDCWD,"/home/kr0m/sshd",{ mode=-rw-r--r-- ,inode=232674,size=2172,blksize=4096 },AT_SYMLINK_NOFOLLOW) = 0 (0x0)
issetugid()                     = 0 (0x0)
write(2,"19: ",4)                 = 4 (0x4)
Inappropriate file type or formatfstatat(AT_FDCWD,"/mnt/6T/.Trash-1000",{ mode=drwx------ ,inode=139526145,size=4096,blksize=4096 },AT_SYMLINK_NOFOLLOW) = 0 (0x0)
write(5,"\^A",1)                 = 1 (0x1)
write(2,"Inappropriate file type or forma"...,33) = 33 (0x21)
openat(AT_FDCWD,"/home/kr0m/sshd",O_RDONLY,00)     = 28 (0x1c)

read(28,"# PAM configuration for the Secu"...,4096) = 2172 (0x87c)
write(2,"\n",1)                     = 1 (0x1)
close(28)                     = 0 (0x0)
SIGNAL 11 (SIGSEGV) code=SEGV_MAPERR trapno=12 addr=0xa0
issetugid()                     = 0 (0x0)
<thread 101310 exited>
<thread 101420 exited>
<thread 101422 exited>
<thread 101421 exited>
<thread 101341 exited>
<thread 101425 exited>
<thread 101426 exited>
<thread 100908 exited>
<thread 101427 exited>
<thread 101424 exited>
<thread 102339 exited>
process killed, signal = 11 (core dumped)
```


----------



## kr0m (Oct 23, 2020)

Thanks to the truss output `Inappropriate file type or formatfstatat(AT_FDCWD,"/mnt/6T/.Trash-1000",{ mode=drwx------ ,inode=139526145,size=4096,blksize=4096 },AT_SYMLINK_NOFOLLOW) = 0 (0x0)` I have discovered that if I don't mount any additional disk partition I can open files without problems.

This is my `/etc/fstab` file content now:


```
# Device        Mountpoint    FStype    Options        Dump    Pass#
linprocfs        /compat/linux/proc    linprocfs    rw    0    0
linsysfs            /compat/linux/sys    linsysfs    rw    0    0
tmpfs            /compat/linux/dev/shm    tmpfs    rw,mode=1777    0    0
fdesc            /dev/fd            fdescfs        rw    0    0
proc            /proc            procfs        rw    0    0

/dev/ada3p2        none    swap    sw        0    0
#Alt+35
# fsck.ext4 -y /dev/ada2p1
#/dev/ada2p1        /mnt/6T        ext2fs    rw        0    0
# fsck.ext4 -y /dev/ada1s1
#/dev/ada1s1        /mnt/1T        ext2fs    rw        0    0
# Nothing needs to be done
#/dev/ada4s1             /mnt/2T         ntfs    mountprog=/usr/local/bin/ntfs-3g,late,rw                0       0

# Chrome: save files in tmpfs
#md /home/kr0m/.cache/chromium mfs rw,late,noatime,noexec,nosuid,-wkr0m:kr0m,-s2g 2 0
```

I have commented out /mnt/6T, /mnt/1T and /mnt/2T. Any idea why they make the apps crash?


----------



## kr0m (Oct 23, 2020)

Tested: the only source of problems is the `#/dev/ada2p1        /mnt/6T        ext2fs    rw        0    0` line in my fstab file. I will try fsck.ext4.


----------



## richardtoohey2 (Oct 23, 2020)

So was everything working before some sort of update?  What update(s) did you do if any?

You seem to have moved away from discussing the /etc/fstab errors - are those still reported?  Is your /etc/fstab correct?

As others are saying, it _seems_ more like you've got some issue with a shared library.

If it seems to be to do with opening recent files, is there anything wrong with your home file directory or location of dot files?  _Maybe_ the editors are crashing trying to read or handle their recent file lists?  Can you figure out e.g. where geany's recent file list is stored and clear it?

If you cat any of the files that cause issues, does the cat work OK?  Think it will based on what you've said about things like vi working.


----------



## kr0m (Oct 23, 2020)

I have tracked the problem down completely: if I create a symlink on the ZFS filesystem that points to an ext4 filesystem, the apps crash. It's not the mount point per se, it's the symlink.


```
bagheera $ ~> ln -s /mnt/6T/evilDir/ evilDir
bagheera $ ~> ls -la evilDir
lrwxr-xr-x  1 kr0m  kr0m  16 Oct 23 22:57 evilDir -> /mnt/6T/evilDir/
```

That way it crashes. But it's not related to symlinks in general; it only happens with the name evilDir. If I create the symlink this other way:

```
bagheera $ ~> ln -s /mnt/6T/evilDir/ evilDir2
bagheera $ ~> ls -la evilDir2
lrwxr-xr-x  1 kr0m  kr0m  16 Oct 23 23:09 evilDir2 -> /mnt/6T/evilDir/
```

it doesn't crash. I suspect it has to do with GTK bookmarks pointing to symlinks on other filesystems.


----------



## kr0m (Oct 23, 2020)

richardtoohey2 said:


> So was everything working before some sort of update? What update(s) did you do if any?

I don't remember whether it started to fail after a specific update.

> You seem to have moved away from discussing the /etc/fstab errors - are those still reported? Is your /etc/fstab correct?

I reported some strange messages about fstab in my second post.

> If it seems to be to do with opening recent files, is there anything wrong with your home file directory or location of dot files? _Maybe_ the editors are crashing trying to read or handle their recent file lists? Can you figure out e.g. where geany's recent file list is stored and clear it?

I have cleared the recent file list in geany's config (the `recent_files=` parameter in `.config/geany/geany.conf`), but it still crashes.

> If you cat any of the files that cause issues, does the cat work OK? Think it will based on what you've said about things like vi working.

cat and other tools work perfectly; geany and the graphical editors crash regardless of which file is opened.

I can't figure out why a symlink with one specific name makes the app crash while a symlink with another name pointing to the same destination directory does not.


----------



## kr0m (Oct 23, 2020)

I only use pkg for everything. And yes, I update my system regularly, but I can't say when it started to crash.


----------



## VladiBG (Oct 24, 2020)

Did you run any memtest?


----------



## _martin (Oct 24, 2020)

I realized the crash happened inside the function, not on entry to it. Hence those registers don't say much about the state (disassembling a few instructions before and after would help). But it's hard to debug it like this, non-interactively.

But truss does show you the SIGSEGV on address 0xa0, i.e. an access to invalid memory. Hence most likely a bug. The report anonymous9 showed you is interesting.

I don't have X on any of my FreeBSD machines, so I can't test it myself. But it might also be worth asking on the mailing list, and/or replying on the bug mentioned above.


----------



## _martin (Oct 24, 2020)

Out of curiosity I spun up a VM and installed gnome3 there. It's a 12.1-RELEASE (r354233) amd64 VM. I've recreated the setup you have (/mnt/6T as the ext4 FS, mounted with fusefs-ext2), with the whole system on ZFS. I tried to open/edit several files with geany but was not able to reproduce the bug. I used the 12.1 install image and then used pkg to install binary packages.

There are probably way too many programs to list, but the most obvious:

```
dbus-glib-0.110                GLib bindings for the D-BUS messaging system
glib-2.66.0_1,1                Some useful routines of C programming (current stable version)
geany-1.36                     Fast and lightweight GTK+ IDE
gnome3-3.36                    "meta-port" for the GNOME 3 integrated X11 desktop
```

EDIT: I didn't have the ext4 FS in fstab but rather used an md device and set the mountpoint manually. Once I put it in fstab I was able to reproduce the crash.  
I used geany and tried to open a file by following the symlink. flockfile() is passed a NULL stream (now stored in rbx), hence the segfault at 0xa0:


```
=> 0x00000008016556df <flockfile+15>:    cmp    QWORD PTR [rbx+0xa0],rax
```
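The address arithmetic behind that instruction can be sketched as follows. This is only an illustration: `FakeFile` and its field names are hypothetical and are not FreeBSD's real `FILE` layout; the only things taken from the thread are the NULL stream in rbx and the `0xa0` displacement.

```python
import ctypes


class FakeFile(ctypes.Structure):
    """Hypothetical layout (assumption): 0xa0 bytes of other members,
    followed by the pointer-sized member that the cmp instruction reads."""

    _fields_ = [
        ("other_fields", ctypes.c_char * 0xA0),
        ("owner", ctypes.c_void_p),
    ]


# rbx holds the stream pointer; in the crash it is NULL (0).
stream_address = 0

# [rbx+0xa0] is simply base address plus member offset.
fault_address = stream_address + FakeFile.owner.offset

print(hex(fault_address))
```

With a NULL base the access lands on the member offset itself, which matches the SIGSEGV address truss reported.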

I'll try to play around with it .. didn't use any X program for some time now though ..


----------



## kr0m (Oct 25, 2020)

anonymous9 said:


> Do you use Gnome?  The problem is with glib. Try the link to the bug in my previous post.


I don't use Gnome, I use awesome. I have checked the link and it seems to be related.


----------



## olli@ (Oct 25, 2020)

kr0m said:


> Tested, the only source of problems is the `#/dev/ada2p1        /mnt/6T        ext2fs    rw        0    0` line in my fstab file, i will try to fsck.ext4


I would regard the ext2fs support in FreeBSD as experimental, especially when mounted read+write. I’m not really surprised that it may cause problems. I recommend using it in read-only mode, or avoiding it altogether if possible.

Another option would be to use the ext2fs FUSE module (from packages / ports) instead of FreeBSD’s own ext2fs driver. It’s less efficient because it runs in userland instead of the kernel, but it might be more reliable.


----------



## _martin (Oct 26, 2020)

I chose geany as the program to debug. The crash occurred *sometimes* when I opened the file-open dialog (Ctrl+O); no file actually gets opened. Note that the program doesn't always crash in the flockfile() function. That's just an observation, not a conclusion; glib is big-ish and I've never debugged it before.

What is more interesting is that it doesn't matter which FS I'm trying to open files on. The only thing that matters is the contents of /etc/fstab. I was able to use geany without any problem on an ext4 filesystem once I removed its entry from fstab (i.e. mount it and remove it from fstab afterwards).

I was able to trigger the same issue when the ext4 FS was listed in fstab but not mounted. I used UFS.

The actual crash seems to be related to g_unix_mount_point_at(). Sometimes a NULL string is passed to strcmp(), sometimes it crashes in flockfile() on a NULL structure.

geany runs 8 threads. A few of them (pool-geany) do the same thing (parsing). It feels like some sort of race condition.
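The kind of race suspected here can be modelled without threads by interleaving two callers by hand. This is a simplified sketch, not libc or glib code: `FstabCursor` is a hypothetical stand-in, and it assumes the fstab enumeration keeps a single process-wide stream that every caller shares.

```python
import io

FSTAB_TEXT = (
    "proc /proc procfs rw 0 0\n"
    "/dev/ada2p1 /mnt/6T ext2fs rw 0 0\n"
)


class FstabCursor:
    """Hypothetical model of a setfsent()/getfsent()/endfsent()-style API
    with one shared stream (an assumption, not the libc source)."""

    def __init__(self, text):
        self._text = text
        self.stream = None  # shared by every caller

    def setfsent(self):
        self.stream = io.StringIO(self._text)

    def getfsent(self):
        # C code would dereference the stream here; in this model a
        # missing stream is reported as an exception instead of a SIGSEGV.
        if self.stream is None:
            raise RuntimeError("stream is NULL: another caller closed it")
        line = self.stream.readline()
        return line.split() if line else None

    def endfsent(self):
        self.stream = None


cursor = FstabCursor(FSTAB_TEXT)

# Caller A starts enumerating and reads one entry.
cursor.setfsent()
first = cursor.getfsent()

# Caller B runs a complete enumeration in between and closes the stream.
cursor.setfsent()
while cursor.getfsent() is not None:
    pass
cursor.endfsent()

# Caller A resumes and finds the shared stream gone.
try:
    cursor.getfsent()
    outcome = "ok"
except RuntimeError as exc:
    outcome = str(exc)

print(first[1], "->", outcome)
```

In the model, caller A's second read fails only because caller B ran in between, which would also fit the observation that the crash happens only *sometimes*.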

UPDATE:  I've fetched the ports and installed glib from there. Using version glib-2.66.2,1 I can't trigger the bug anymore.


----------



## kr0m (Oct 26, 2020)

I am using packages, so when glib-2.66.2,1 arrives in the binary packages the bug will be solved, won't it?


----------



## _martin (Oct 26, 2020)

I'd say yes, you'll get this fixed with the new version of glib.

Also, the fixes in the port version include this one:


```
------------------------------------------------------------------------
r552776 | fluffy | 2020-10-20 01:33:28 +0200 (Tue, 20 Oct 2020) | 9 lines

devel/glib20: lock getfsent() usage to fix some consumers crashes

Add temporary fix while more correct solution is cooking in GNOME repo
(see details at https://gitlab.gnome.org/GNOME/glib/-/merge_requests/1707)

PR:        250311
Submitted by:    sigsys@gmail.com
Reviewed by:    tijl
```
It is related to the bug provided by anonymous9.
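
The commit message says the port fix locks `getfsent()` usage. A minimal sketch of that idea follows, written in Python rather than C and with hypothetical names (`SharedEnumerator`, `mount_points`); it is not the actual patch, only an illustration of serializing a rewind/read/close sequence on shared state.

```python
import threading

FSTAB_ENTRIES = [
    ("proc", "/proc", "procfs"),
    ("/dev/ada2p1", "/mnt/6T", "ext2fs"),
]


class SharedEnumerator:
    """Hypothetical stand-in for an enumeration API with one shared cursor."""

    def __init__(self, entries):
        self._entries = entries
        self._pos = None

    def rewind(self):
        self._pos = 0

    def next_entry(self):
        if self._pos is None or self._pos >= len(self._entries):
            return None
        entry = self._entries[self._pos]
        self._pos += 1
        return entry

    def close(self):
        self._pos = None


_enumerator = SharedEnumerator(FSTAB_ENTRIES)
_enumerator_lock = threading.Lock()


def mount_points():
    """Return every entry; the lock makes rewind/read/close one atomic unit."""
    with _enumerator_lock:
        _enumerator.rewind()
        entries = []
        while True:
            entry = _enumerator.next_entry()
            if entry is None:
                break
            entries.append(entry)
        _enumerator.close()
        return entries


results = []


def worker():
    results.append(mount_points())


# Eight workers, mirroring the thread count observed in geany.
threads = [threading.Thread(target=worker) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(all(result == FSTAB_ENTRIES for result in results))
```

Because the whole sequence runs under one lock, no thread can reset or close the cursor while another is still reading from it.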


----------

